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Preface 



The present book includes extended and revised versions of a set of selected papers 
from the Seventh International Conference on Informatics in Control, Automation and 
Robotics (ICINCO 2010), held in Funchal, Madeira - Portugal, from 15 to 18 June 
2010. The conference was organized in three simultaneous tracks: Intelligent Control 
Systems and Optimization, Robotics and Automation, and Systems Modeling, Signal 
Processing and Control. The book is based on the same structure. 

ICINCO 2010 received 320 paper submissions, from 57 countries in all continents. 
From these, after a blind review process, only 27 were accepted as full papers, of 
which 20 were selected for inclusion in this book, based on the classifications 
provided by the Program Committee. The selected papers reflect the interdisciplinary 
nature of the conference. The diversity of topics is an important feature of this 
conference, enabling an overall perception of several important scientific and 
technological trends. These high quality standards will be maintained and reinforced 
at ICINCO 2011, to be held in Noordwijkerhout, The Netherlands, and in future 
editions of this conference. 

Furthermore, ICINCO 2010 included 6 plenary keynote lectures given by Jose 
Santos-Victor (Instituto Superior Tecnico, Portugal), Alicia Casals (Institute for 
Bioengineering of Catalonia. IB EC and Universitat Politecnica de Catalunya.UPC, 
Spain), Bradley Nelson (Institute of Robotics and Intelligent Systems at ETH-Zurich, 
Switzerland), Wisama Khalil (Ecole Centrale de Nantes, IRCCyN, France), Oleg 
Gusikhin (Ford Research & Adv. Engineering, USA) and John Hollerbach 
(University of Utah, USA). We would like to express our appreciation to all of them 
and in particular to those who took the time to contribute with a paper to this book. 

On behalf of the conference organizing committee, we would like to thank all 
participants. First of all to the authors, whose quality work is the essence of the 
conference and to the members of the Program Committee, who helped us with their 
expertise and diligence in reviewing the papers. As we all know, producing a 
conference requires the effort of many individuals. We wish to thank also all the 
members of our organizing committee, whose work and commitment were invaluable. 

October 2010 Juan Andrade-Cetto 

Jean-Louis Ferrier 
Joaquim Filipe 
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Dynamic Modeling of Robots Using 
Newton-Euler Formulation 



Wisama Khalil 

Ecole Centrale de Nantes, IRCCyN UMR CNRS 6597, 1 Rue de la Noe, 44321, Nantes, France 

Wisama . khalil@irccyn . ec-nantes . f r 



Abstract. This paper presents the use of recursive Newton-Euler formulation to 
model different types of robots. The main advantages of this technique are the 
facility of implementation and obtaining models with reduced number of 
operations. The following structures are treated: rigid tree structure robots, 
closed loop robots, parallel robots, robots with lumped elasticity, robots with 
moving base and wheeled robots. 

Keywords: Dynamic modeling, Newton-Euler, Recursive calculation, Tree 
structure, Parallel robots, Flexible joints, Mobile robots. 



1 Introduction 

The dynamic modeling of robots is an important topic for the design, simulation, and 
control of robots. Different techniques have been proposed and used by the robotics 
community. In this paper we show that the use of Newton-Euler recursive technique 
is easy to develop and implement. The proposed algorithm can be extended to many 
types of structures. In section 2 we recall the method used to describe the kinematics 
of the structure, and then in section 3 we present the inverse and direct dynamic 
models of tree structure rigid robots. The following sections present the generalization 
to the other systems. 



2 Description of the Robots 

The structures of robots will be described using Khalil and Kleinfinger notations [1]. 
This method can take into account tree structures and closed loop robots. Its use 
facilitates the calculation of minimum number of inertial parameters [2]... [4], which 
reduce the number of operations of the dynamic models and are needed in dynamic 
identification and adaptive control laws. 

2.1 Geometric Description of Tree Structure Robots 

A tree structure robot is composed of n+1 links and n joints. Link is the base and 
link n is a terminal link. The joints are revolute or pris-matic, rigid or elastic. The 
links are numbered consecutively from the base, to the terminal links. Joint j connects 

J.A. Cetto et al. (Eds.): Informatics in Control, Automation and Robotics, LNEE 89, pp. 3-|20J 
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link j to link a(j), where a(j) denotes the link antecedent to link j. A frame Rf is 
attached to each link i such that (Fig. 1): 

• zt is along the axis of joint /; 

• X( is along the common normal between zi and one of the succeeding joint axes. 




Fig. 1. Geometric parameters for frame j 



The transformation matrix l hj , defining frame Rj with respect to (wrt) frame R( is 
obtained as a function of six geometric parameters ( Yj > bj > OCj , dj , Oj , r • ) such that: 

l hj = Rot( z, Yj ) Tran( z,bj ) Rot( x, 0Cj ) Tran( x,d .■ ) Rot( z, Oj ) Tran( z, r ; - ) 
After developing, this matrix can be written as follows: 



l hj = 






CTCe-STCaSG -CySG-STCaCe S-ySa dCy+rSySa 

STCG + CTCaSe -S-ySe + CTCaCe -CySa dSy-rCYSa 

SaS6 SaCe Ca rCa + bj 

1 



(1) 



Where l R defines the (3x3) rotation matrix and l Pj defines the (3x1) vector defining 

the position of the origin of frame j with respect to frame i. 

If Xj is along the common normal between Z[ and Zj, both yj and bj will be zero. 
The type of the joint is identified by <7y where <7y =0 if j is re volute, <7y = 1 if j 

is prismatic, and Gj = 1-Gj . The joint variable, denoted asg^ , is dj if j is re volute 
and Vj if j is prismatic. 

A serial structure is a special case of a tree structure where 
a ( j ) = j ~ 1? Yj ~ 0, bj = for allj = 1,.., n. 

2.2 Description of Closed Loop Structure 

The system is composed of L joints and n + 1 links, where link is the fixed base and 
L > n. The number of independent closed loops is equal to B = L - n. 
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The joints are either active (motorized) or passive. The number of active joints is 
denoted N. The location of all the links can be determined as a function of the active 
variables. The geometric parameters of the mechanism can be determined as follows: 

a) Construct an equivalent tree structure having n joints by virtually cutting each 
closed chain at one of its passive joints. Define the geometric parameters of the tree 
structure as given in section 2.1. 

b) Number the cut joints k= n+1, . . .,L, 

c) For each cut joint define two frames on one of the links connected by this joint. 

Assuming a cut joint k, and that the links connected by it are link i and link j, the 
frames are defined as follows (Fig. 2): 

- frame R^ is fixed on link j such that a(k)=i, the axis Zk is along the axis of joint k, 
and %k is along the common normal between Zk and Zj- The matrix l h k is determined 
using the parameters ( y k , b k ,a k ,d k ,6 k ,r k ) . 

- frame Rk+B i s aligned with Rk, but a(k+B)= j. The matrix J h k+B is constant. 
The joint variables are denoted as: 



(2) 



• q tr tree structure joint variables, q c the cut joint variables, 

• q a , q p active and passive joint variables of the tree structure. 

Since R k and Rk+B are aligned, the geometric constraint equations for each loop, can 
be written as: 



<ltr 


><ltr = 


% 


_4c_ 




_q p _ 



k+B 



h; 



'h,. 



(3) 




777777T 
Fig. 2. Frames around a cut joint k 

The kinematic constraint equations are given by: 
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°V -°V 

(4) 

h<i bl =h+B<i b2 

Where Vy. defines the (6x1) kinematic screw vector of frame k, given by: 



v k 



v[ 4] (5) 



v k linear velocity of the origin of frame R^, a\ angular velocity of frame k; 

J^ the kinematic Jacobian matrix of frame k; 

q ,q joint velocities through the two branches of the loop. 

3 Dynamic Modeling of Tree Structure Robots 

3.1 Introduction 

The most common methods used to calculate the dynamic models are the Lagrange 
equations and the Newton Euler Equations [5],. . .[7]. The Lagrange model is given as: 

r= A(q)q + H(q,q) (6) 

where A is the inertia matrix of the robot and H is the Coriolis, centrifugal and gravity 
torques. 

Calculating Tin terms of(q,q,q) is known as the inverse dynamic problem, and 
calculating gin terms of (q,q,T) is known as the direct dynamic model. The inverse 
dynamic model is obtained from (6), whereas the direct model is calculated as: 

^{Afir-H) (7) 

The recursive Newton-Euler algorithms have been shown to be the most efficient 
technique to model rigid robots [8]... [11]. In [12] a recursive Lagrange algorithm is 
presented but without achieving better performances than that of Newton-Euler. 

The Newton-Euler equations giving the total forces and moments on a link j about 
the origin of frame j are written as: 



J ^j = W 



J a>jx(1l oj 1a>j) 



(8) 



where 



&>; angular velocity of link j ; V • contains the linear acceleration of the origin of 

frame j, denoted vj , and the angular acceleration co-, . 

O, total wrench (forces and moments) on link j at origin of frame Rf, 
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/; (6x6) inertia matrix of link j, which is given as: 



J J ■ = 



Mjl 3 



- j MS; 



JMS J JI oj 



(9) 



Where Mj, MS j and 7 • are the standard inertial parameters of link j. They are 
respectively, the mass, the first moments, and the inertia matrix at the origin. 

3.2 Calculation of the Inverse Dynamics Using Recursive NE Algorithm 

The algorithm consists of two recursive computations: forward and backward [8]. The 
forward equations, from 1 to n, compute the velocities, accelerations and the dynamic 
wrench of links. The backward equations from n to 1, provide the joint torques. The 
algorithm will be denoted by: 



r=NE(q,q,q,F e ) 



(10) 



Where F e is the external wrench exerted by the links of the robot on the environment. 
The forward equations for j = l,...,n are as follows: 



J co j = J R i 'co i +a j J z j qj 



(11) 



JV j = JT i i V i +qjJa j + 



j R, 



' (qxr (tyX 1 Pj j +2cTj(-'(O i Xqj ->Zj ) 



ajJ^XqjJzj 



(12) 



^j = w 



icOjX^ajXJMSj) 
>a>jx( J I oj J (Oj) 



Where 



(13) 



-i = a( j ), 3 Zj =[0 7] , J cij = <Jj Gj\ , w defines the skew matrix of 
vector product associated to the (3x1) vector w such that wxv = wv. 
- J T t the (6x6 ) screw transformation matrix: 



J T, 



% J Pi J Ri 



J 3x3 



J R> 



(14) 



These equations are initialized by Ofy = 0, cqq = . The gravity effect on all the links 
is taken into account by putting v = -g . 
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Fig. 3. Forces and moments acting on link j 

The backward recursive equations are deduced from the total forces and moments 
on link j around its origin (Fig. 3). They can be calculated for j = n,...,l 

W£ k 7j Tk F k + J F ej 

k (15) 

F J = >F f >a i + l aj 4j + F sj si g n (<ij) + F vj 4j 
Where a(k )= j , F; is the reaction wrench (forces and moments) of link i on 
link j , F e j is the wrench (force and moment) exerted by link j on the environment, 
la j is the gear and rotor inertia of actuator j and F s j , F v j are the coulomb and 

viscous friction parameters. 

The computational cost of this algorithm is linear in the number of degrees of 
freedom of the robot. To reduce the number of operations of the calculation of this 
model the base inertial parameters and the customized symbolic calculation can be 
used [9]. ..[11]. 

3.3 Computation of the Direct Dynamic Model 

Two methods based on Newton-Euler methods can be used to obtain the dynamic 
model: the first is proposed by Walker and Orin [13], it is based on calculating the A 
and H matrices, defined in (6), using Newton-Euler inverse dynamic model and to 
calculate the joint accelerations by (7); the second method is based on a recursive 
algorithm [14] [16]. 

3.3.1 Using the Inverse Dynamic Model to Calculate the Direct Dynamic Model 

From (6) and (10) we deduce that H(q,q) is equal to r if q=0 , and that the i th 
column of A( q ) is equal to T if: q = u t ,q = 0,g =0, F e j = , where u t is the (nxl) 
unit vector whose i th element is equal to 1, and the other elements are equal to zero. 



Dynamic Modeling of Robots Using Newton-Euler Formulation 



3.3.2 Recursive NE Computation of the Direct Dynamic Model 

Using (13) and (15) the equilibrium equations of link j can be written as: 

k 

where k denotes the links articulated on link j such that a(k)=j, and 

j 0)j x( j o)j x j MSj) 



(16) 






j a>jx( j lOj j a>j) 



(17) 



The joint accelerations are obtained as a result of three recursive computations: 

i) first forward computations for j = 1, ..., n: in this step, we compute the screw 

transformation matrices J T t , the link angular velocities J cVj and J Yj, J /?• vectors, 

which appear in the link accelerations equation (12), and the link equilibrium equation 
(17); 



Y . 



j R i [ i co i x( i co i x i P j )y2a j ( j co i xq j j z j ) 



a j J m i xq j U j 



(18) 



ii) backward recursive computation [6]: in this step we calculate the elements J H • , 

J Jj , J j3 j , J Kj , J a j which express qj and l Fj in terms of l V t in the third recursive 
equations. Thus for j = n, ... ,1 compute: 



Hj = ( j aj j J* j aj+ Iaj) 
j K 



j J*- j J* j a-H/ j a T : j J 



J J 



J OC: 



J J 



(19) 



-j-'Kj'rj+'Jj'ajHjUTj+'ayPjh'flj 
If a( j )^0 calculate also: 



#= i p;- j T^a j 

*j*= *j* +JT? j K ; j T. 



(20) 



These equations are initialized by: J Jj = J J ' -, ] j5 .• = ] j5 .• for j=l,...,n 
Hi) third recursive equations. Since ( v = —g, tv =0 
give J V- and J Fj (if needed) for j = 1... n as follows: 



Hi) third recursive equations. Since ( v =-g, tv = ), the third recursive equations 
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*j = H J [ - Ja J Jj J (3T i ly i + 3 Yj ) + T J +Ja J J Pj ] 



J Fj = J K/T t l V t + J aj 



(21) 



where Tj = ^j~F sj sign(qj ) - F vj q. 

4 Inverse Dynamic Modeling of Closed Loop Robots 

The computation of the inverse dynamic model of closed loop robots can be obtained 
by first calculating the inverse dynamic model of the equivalent tree structure robot, 
in which the joint variables satisfy the constraints of the loop. Then the closed loop 
torques of the active joints r c are obtained by solving the following equation: 



o 



r tr (q tr 4 tr ,q tr )+W T A (22) 



Where A is the vector of Lagrange multipliers and W q tr =0 is the kinematics 

constraint equations between the velocities of the passive and active joints of the tree 
structure. The matrix Wean be obtained from (4). 
After developing we obtain: 

t dq p 

r c= G r tx (<ltr Atr > '4tr ) = F a + ^T~ F p ( 23 ) 

where: 

G= Mt IL= Mtr_ (24) 

Ma Ma 

r a and r p are the actuated and passive joint torques of the tree structure. 

From equation (23) we see that the active joints of the closed loop structure is 
calculated by projecting the tree structure torques r tr on the motorized joints using 

the transpose of the Jacobian matrix of the tree structure variables (or velocities) in 
terms of the active joint variables (or velocities) [6], [17]. 

There is no recursive method to obtain the direct dynamic model of closed loop 
robots. It can be computed using the inverse dynamic model by a procedure similar to 
that given in section (3.3.1) in order to obtain the matrices A c and H c such that: 

F c = A c (4tr Y4a + H c (itr 4tr ) (25) 

5 Inverse Dynamic Modeling of Parallel Robots 

A parallel robot is a complex multi-body system having several closed loops. It is 
composed of a moving platform connected to a fixed base by m parallel legs. The 
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dynamic model can be obtained as described in the previous section, but in this 
section we present a method that takes into account the parallel structure. To simplify 
the notations we present the case of parallel robots with six degrees of freedom. 
Examples concerning reduced mobility robots are given in [18]. 

The inverse dynamic model gives the forces and torques of motorized joints as a 
function of the desired trajectory of the mobile platform. 

Decomposing the robot into two subsystems: the platform and the legs. The 
dynamics of the platform is calculated by Newton-Euler equation as a function of the 
Cartesian variables (location, velocity and acceleration of the platform), whereas the 
dynamics of each leg i is calculated as a function of its joint variables (ft, ft, ft) for 

i=l,...,m. The active joint torques are obtained by projecting these two dynamics on 
the active joint axes. 

To project the dynamics of the platform on the active joint space we multiply it by 
the transpose of the robot Jacobian matrix, and to project the leg dynamics on the 
active joint space we use the Jacobian matrix between these two spaces. Thus the 
dynamic model of the parallel structure is given by the following equation: 



r = P P (0 P + F ep )+Z 



i=A d <la 



T 



r, (26) 



Where 

P is the wrench required to achieve the motion of the platform, calculated by 
Newton- Euler equation (13) with the platform inertial parameters. 
F external forces on the platform. 

Jp is the (6 X 6) kinematic Jacobian matrix of the robot, which gives the platform 
screw V P (translational and angular velocities) as a function of the active joint 
velocities: 

V P =J P q a (27) 

r t is the inverse dynamic model of leg i, it is a function of (ft ,ft ,ft ) , which can be 
obtained in terms of the platform location, velocity and acceleration, using the inverse 
kinematic models of the legs [19]. We note that ft does not include the passive joint 
variables connecting the legs to the platform. 

The calculation of J p is obtained by inverting J p , which is easy to obtain for most 
parallel structures. The detailed calculation of 3ft /dq a is given in [18]. 

6 Inverse Dynamic Modeling of Robots with Elastic Joints 

A tree structure robots with lumped elastic joints can be described using the method 
presented in section 2. Each joint could be elastic or rigid. 



12 



W. Khalil 



6.1 Lagrange Dynamic Form 

The general form of the dynamic model of a system with flexible joints is the same as 
(6), it can be partitioned as follows: 



r. 



~r r ~ 




~ A 11 A 12~ 


q r 


+ 


~H r (q,q) 


r f_ 




_ A 12 A 22_ 


?f. 


H f (q,q) 



(28) 



Where (q,q,q) are the (nxl) vectors of positions, velocities, and accelerations of 
rigid and elastic joints; 

H (q,q) is the (nxl) vector of Coriolis, centrifugal and gravity forces, 

A(g) is the (nxn) inertia matrix of the system, 

r r is the vector of rigid joint torques, 7~y is the vector of elastic joint torques. 

If joint j is flexible: 



F J=- A( lj k J=-(<lj 



-q rj) kj 



(29) 



where kj is the stiffness of the elastic joint, Aqj is the elastic deformation, q r j is 

the reference joint position corresponding to zero elastic deformation. 

The inverse dynamic model calculates the input torques and the elastic 
accelerations as a function of the joint positions, velocities and rigid joint 
accelerations. The definition of the direct model definition is similar to that of the 
rigid robots; it gives the joint accelerations in terms of the input torques. 

Eq. (28) can be used to calculate the inverse model, we have first to calculate the 
elastic accelerations from the second row, and then we calculate the rigid joint torques 
from the first row. 



6.2 Direct Dynamics of Systems with Flexible Joints Using Recursive NE 

The direct dynamic model of system with flexible joints can be calculated using the 
recursive direct dynamic model algorithm of rigid joints presented in section (3.3) 
after putting Fj = - Aqj kj for the elastic joints. We note that the recursive algorithms of 
sections 6.2 and 6.3 can be used in case of passive joints without actuators by putting 
the torque of these joints equal to 0. 

6.3 Inverse Dynamics of Systems with Flexible Joints Using Recursive NE 

The recursive inverse dynamic algorithm of rigid links cannot be used for system with 
flexible joints since the accelerations of the flexible joints are unknown. But it can be 
used to obtain the A and H matrices as explained in section (3.3.1), then we can 
proceed as explained in section 6.1 for the calculation of q* and7" r . 

To solve this problem we use three recursive steps [20] . 

i) The first forward step is the same as that of the direct dynamic model 
(section 3.3). 
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ii) The second backward recursive equations calculate the matrices giving the elastic 
accelerations q e and J F ' • in terms of l V t . These matrices can be calculated for 
j = n...l as follows: 

- if joint j is elastic: 

h j = Ja j Jj *j Ja J 

JK j = jJ h jJ l Ja j H J 1Ja j jJ l oo) 

*aj = j Kj J 7j + j fj j aj HjU-kjAqj +i a] '$ )- j P* 

- If joint j is rigid: 



J K = J J 



Ja j = JK j (Jr j + J a j q j )- J $ 
if a( j )^0 , calculate: 



(31) 



(32) 



The previous equations are initialized by: J J: = J J.-, J j3 • = J j8; . 

The third recursive equations (for j = /,..., n ) calculate #,- for the elastic joints and 
the joint torques for the rigid joints using the following equations: 

j Fj = j Kj j T t % + j GCj (33) 

- if j is elastic: 

qj = Hf[-Jaj JJ* ( % % + J 7j )- kj A qj +U] '>*] 

- if j is rigid 

r j = Jf j r ]<2 i + 7 «/' #/ + F sj si 8 n ( <lj ) + F v/ 4 j ( 35 ) 

7 Dynamic Modeling of Robots with a Mobile Base 

This section treats mobile robots which are composed of a tree structure with a 
moving base. It includes a big number of systems such as: cars, mobile manipulators, 
walking robots, Humanoid robots, eel like robots [21], snakes like robots, flying 
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robots, spatial vehicles, etc. The difference between all of these systems will be in the 
calculation of the interaction forces with the environment. In the previous sections the 
acceleration of the base is known as equal to zero, whereas in the case of moving base 
the acceleration of the base must be determined in both direct and inverse dynamic 
models. The inverse dynamic model, which is used in general in the control problems, 
can be used in simulation too when the objective is to study the evolution of the base 
giving joint positions, velocities and accelerations of the other joints. The direct 
dynamic model can be used in simulation when the joint torques are specified. 

We use the same notations of section 2 to describe the structure. The base fixed 

frame Rq is defined wrt the world fixed frame R w hy the transformation matrix w /zq . 

This matrix is supposed known at t = 0, it will be updated by integrating the base 
acceleration. The velocity and acceleration of the base are represented by the (6x1) 
vectors V and V respectively. 



7.1 General Form of the Dynamic Model 

The dynamic model of a robot with a moving base can be represented by the 
following relation: 



(36) 



^6x1 


= A 


v o 


+ H = 


~ A 11 A 12~ 


v o 


+ 


'Hi 


r 




_ 4 _ 




An A 22_ 


_ 4 _ 




H 2_ 



r (nxl) vector of joint torques, q (nxl) vector of joint positions, 

A is the (6 + n) x (6 + n) inertia matrix of the robot, 

His the (n + 6) x 1 vector representing the Coriolis, centrifugal, gravity and 

external forces effect on the robot. 

The inverse dynamic model gives the joint torques and the base acceleration in 
terms of the desired trajectory (position, velocity and acceleration) of the articulated 
system (links 1 to n) and the base position and velocity. At first, the inverse dynamic 
model is solved by using the first row of (36) to obtain the base acceleration: 



o T > 



%=-{A n y 1 {H 1 +A 12 q) 



Then the second row of (36), can be used to find the joint torques: 

r = Aj 2 °V +A 22 q + H 2 



(37) 



(38) 



The direct dynamic model gives the joint accelerations and the base acceleration in 
terms of the position and velocity of the base and the articulated system and the joint 
input torques. Thus using (36), the direct dynamic model is solved as follows: 



^0 

4 



-Hj 

r-H 7 



(39) 



The calculation of A and H can be done by Lagrange method. They can also be 
calculated using the inverse dynamic model of tree structure of section (3.2) and using 
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the procedure of section (3.3.1). The mobile base can be taken into account as 
follows: 

- the initial values of velocities and accelerations are calculated from those of the 
base as follows: Vq = V^ , v = v - g, co - (fy , where the subscript b indicates 
the base. 

- The backward recursive equations must continue to j = 0, where this new iteration 
will obtain the 6 equations of Newton-Euler equations of the base. 

Solving the inverse and direct dynamic problems using A and H is very time 
consuming for systems with big number of degrees of freedom (as the eel like robot). 
Therefore, we propose here to use a recursive method, which is easy to programme, 
and its computational complexity is linear in the number of degrees of freedom. 

7.2 Recursive NE Calculation of the Inverse Dynamic Model 

The inverse dynamic algorithm in this case consists of three recursive equations 
(a forward, then a backward, then a forward) [21]. 

i) Forward recursive calculation: 

In this step we calculate the screw transformation matrices, the link rotational 
velocities, and the elements of the accelerations and external wrenches on the links 
which are independent of the acceleration of the robot base (vq,cqq). Thus we 

calculate for j = l,...,n the following: J 7] , ] co ■ • as given in section (3.2). We calculate 

also ] Py ] Y\ using (17) and (18) they represent the elements of total forces and 
acceleration which are independent of the base acceleration. We define also: 

ii) Backward recursive equations: 

In this step we obtain the base acceleration using the inertial parameters of the 
composite link 0, where the composite link j consists of the links j, j+1, . . ., n. 
We note that (17), giving the equilibrium equation of link j, can be rewritten as: 



k 
Applying the Newton-Euler equations on the composite link j, we obtain: 



(41) 



t j~ J j V j Pj + lu L j\ J s(j) v s(j) Ps(j)j (42) 

s(j) 

Where s(j) means all the links succeeding joint j, that is to say joining j to the terminal 
links. 

Substituting for V s (\) m terms of J Vj using (12), we obtain: 
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s(j) 



(44) 



(45) 



S %) = * f; 'Wl T r% (43) 

r 

Where r denotes all links between j and s(j). 
Thus we obtain: 

i F =ij fly if 
J J J j 

with: 

i J c j = i J c j +Y, kT [ kjC k% 
k 

ip C j = ifj-j: k T[ k ti + k T[ k 4 k C k 
k 

Vj is the inertial matrix of the composite link j. 

For j = 0, and since F Q is equal to zero, we obtain using (44): 

%=(%)''% (46) 

To conclude, the recursive equations of this step consist of initializing V? = J /; , 

•7?J = J/?j and then calculating (45) for j = n, . . ., 0. At the end °V is calculated by (46). 

Hi) Forward recursive equations: 

After calculating V , the wrench J F; and the joint torques are obtained using 

equations (12) and (45) for j= 1,..., n as: 

¥ j = j r i V i + j f j 

. . (47) 

J J J j 

The joint torque is calculated by projecting J F; on the joint axis, and by taking into 
account the friction and the actuators inertia: 



F j = ]F ] ]a ) +I aj '4j + F sj si g n (qj ) + ^vj 4j 



(48) 



It is to be noted that the inverse dynamic model algorithm can be used in the dynamic 
simulation of the mobile robot when the objective is to study the effect of the joint 
motions on the base. In this case the joint positions, velocities and accelerations 
trajectories are given. At each sampling time the acceleration of the base will be 
integrated to provide the angular and linear velocities for the next sampling time. 
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7.3 Recursive Direct Dynamic Model 

The direct dynamic model consists of three recursive calculations forward, backward 
and forward: 

i) Forward recursive equations: 

We calculate fovj = l,...,n the link rotational velocities using (11) and the terms J y; 

and J /?jOf Cartesian accelerations and equilibrium equations of the links that are 

independent of the accelerations of the base and the joints. 

ii) Backward recursive equations: 

In this second step, we initialize J /; = V; , J /?; = J /?; and then we calculate for j = 

n,...,l the following elements, which permit to calculate J F; and <?'• in terms of l Vj 
and will be used in the third recursive equations: 



1 /* = '/* + JTjTJJSTjJTj 



(49) 



Hi) Forward recursive equations: 

At first, the base acceleration is calculated by the following relation: 

%=(%)''% (50) 

'4j and J F; (if desired) are calculated for j= 1,..., n using the following equations: 

qj = H-f [- J a] h) ( % % + J rj ) + T J + 'fj J/3) ' 



(51) 



J Fj = JKj % % + J aj 



where: tj = rj -F sj sign(qj ) - F vj qj 

j V j = j T i i V i + j a j qj + J r j (52) 
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8 Dynamic Modeling of Wheeled Mobile Robots 

The robot is composed of a cart and classical wheels. The wheels could be fixed, 
steering or casters. Each of these wheels has a rotational variable <Pj . The steering 

and castor wheels have also an orientation variable /?,■ . The vector of configuration 

variables is composed of the wheel variables <Pj , f3j and the posture variables £ 

giving the position and orientation of the cart. To simplify the presentation we 
consider that the motion of the cart lies in a plane thus the posture is composed of the 
x,y position coordinates and the orientation is the angle around the z axis. For more 
details on these robots the reader can consult [22]. These mobile robots contain non- 
holonomic constraint equations between the configuration velocities and have some 
joints which are not motorized. The constraint equations can be written as: 

Wq = 0,and q = Sq m (53) 

£ ft <p \ , q m is the motorized joint variables among (3 and <p . 

The difference with respect to the previous sections is that the coupling between 
the configuration variables is kinematically defined. 

In this case we apply the procedure of closed loop structure of section 4 such that: 



r 

± m 



= r(q4,q)+W T A 



r 

± m 



dq 
dq t 



T (54) 



m J 



r is the torques corresponding to the configuration variables without taking into 
account the constraints, it can be calculated using the method of section 3.2, 
r m correspond to the real motor torques. 



9 Conclusions 

This paper presents the inverse and direct dynamic modeling of different types of 
robots. The dynamic models are developed using the recursive Newton-Euler 
formalism. The inverse model provides the torque of the joint and the acceleration of 
the free degrees of freedom such as the elastic joints, or the acceleration of the base in 
case of mobile base. The direct model provides the joint acceleration of the joints 
including those of the free degrees of freedom. 

These algorithms constitute the generalization of the algorithms of articulated rigid 
manipulators to the other cases. 

The proposed methods have been applied on more complicated systems such as: 
flexible link robots [23], Micro Continuous system [24] and Hybrid structure, where 
the robot is composed of parallel modules, which are connected in series [25]. 
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Abstract. This paper describes the Emotive Driver Advisory System (EDAS), 
Ford Research & Advanced Engineering's project on next generation driver as- 
sistance. EDAS integrates several emerging technologies, focusing on personal- 
ization and adaptive and intelligent behavior. We will provide a high-level 
overview of the EDAS architecture, focusing on novel consumer-facing fea- 
tures, such as the emotive spoken dialogue system and an Avatar as an automo- 
tive human machine interface. Furthermore, we will discuss the benefits of 
cloud-based vehicle infotainment and decision support and how this can be in- 
tegrated in a vehicle environment. The system concept was revealed at the 2009 
Consumer Electronics Show and the 2009 North American International Auto 
Show. 

Keywords: Driver aware vehicle, Computational intelligence, Spoken dialogue 
system, Embodied conversational agents, Affective computing, Cloud-based 
infotainment. 



1 Introduction 

This paper introduces a research project, the Emotive Driver Advisory System 
(EDAS), which explores next-generation driver/vehicle interaction with a focus on 
intelligent and emotive behavior. The goal of EDAS is to leverage recent advances in 
affective computing, embodied conversational agents, intelligent vehicle controls, and 
internet personalized recommendation services to create a new generation of driver 
assistance systems: we want provide an enhanced level of personalization and cus- 
tomization leading to an emotive bond between the driver and vehicle. 

In recent years, vehicle telematics and connectivity have started to play a very 
important role for automotive brand differentiation and competitive advantage. In 
2007, Ford, in collaboration with Microsoft, introduced an in-car communication and 
entertainment system, SYNC [1], which enables Bluetooth and USB connectivity for 
consumer phones and MP3 players and allows hands-free voice-activated control of 
brought-in devices. Leveraging consumer phone connectivity, subsequent SYNC 
versions encompass a set of cloud-based information services including vehicle 
health reports, traffic, turn-by-turn directions, weather, and news. The latest SYNC 
release enables seamless voice/button interface and control of smart phone mobile 
applications. 

J.A. Cetto et al. (Eds.): Informatics in Control, Automation and Robotics, LNEE 89, pp. 21-J36J 
springerlink.com © Springer- Verlag Berlin Heidelberg 201 1 
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To date, SYNC has focused on delivering functionality and usability. According to 
Maslow's hierarchy, consumers' needs from a product can be characterized using a 
three-level hierarchy from functionality and usability to pleasure [2]. Products that 
invoke a positive emotive response have a competitive edge in the marketplace. Clas- 
sic examples are Harley-Davidson motorcycles or IPod MP3 players. As a result, 
efforts to incorporate emotive aspects in functionality and services are actively being 
pursued to enhance product positioning. 

There are currently two main engineering disciplines that address human emotions: 
Kansei Engineering and affective computing. The design discipline to incorporate a 
user's emotive response in a product, Kansei engineering, was initiated in Japan in the 
1970s: it is also referred to as "emotive" or "affective" design [3]. Affective comput- 
ing is an emerging area that explores methods and tools to enable an HMI to recog- 
nize and understand human emotions, as well as the ability to express emotions [3]. 
Affective computing is based on recent technological advances in artificial intelli- 
gence, biomechanical sensing, computer vision, and speech recognition [4]. The rise 
of affective computing is closely tied to the recent development and popularity of 
humanoid robots. In 1995, SONY launched an entertainment robot called "AIBO," 
which has, among other features, the ability to recognize certain situations and to 
respond with pre-programmed emotions [5], [6], [7], and [8]. Affective computing 
can complement Kansei engineering, as it provides the methods and tools to directly 
capture emotive response from the user. 

Automotive companies are displaying a heightened interest in bringing emotions 
into the vehicle: this is because in addition to aesthetic value, emotions can directly 
impact driving safety and performance. For example, Toyota developed a concept 
vehicle call the POD in 2001, which can detect driver's emotion and express the 
driver's or its own feelings using multi-color light patterns on the front facial panel of 
the car [9]. Nissan developed a PIVO robot in 2005 and a PIVO 2 in 2007 that reside 
in concept vehicles: the robot is a dashboard talking buddy that can chat with the 
driver, provide various information services, and express emotions [10] [11]. 

The major motivation of the ED AS work described in this paper and the goal of 
our system was to bring the vehicle interface to a completely new level: from simple 
voice control to a personal assistant that can understand the driver and driving envi- 
ronment. This work was a research and development effort that falls into the broad 
scope of affective computing, intelligent control, and machine intelligence but with 
the focus on an automobile as the product platform. We demonstrate this in a func- 
tional prototype vehicle that offers the driver the experience of riding inside a car that 
is connected, intelligent, can understand your natural speech, talk back naturally, 
recognize and express emotions, and can offer personalized driving advice and 
assistance. 

The rest of the paper is devoted to the major elements of the ED AS system and its 
implementation. Specifically, Section 2 provides an overview of the ED AS architec- 
ture, Section 3 discusses how we use an Avatar as an advanced HMI, Section 4 de- 
scribes the emotive spoken dialogue system, and Section 5 provides an overview of 
cloud-based infotainment. The paper concludes with a summary of the ED AS system 
and its impact. 
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2 EDAS Architecture 

The EDAS architecture (see Fig. 1) is designed to facilitate personalization of the 
interaction between the driver and vehicle through customization and context- 
dependent learning and supports a mixed-initiative multi-task environment. The main 
elements of the EDAS system are an Intelligent Vehicle Agent, an emotive Spoken 
Dialogue System (SDS) supplemented by an Avatar, a Message Dispatcher, and a 
Task Manager. 
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Fig. 1. Diagram of EDAS architecture. 



The Intelligent Vehicle Agent aggregates and interprets vehicle data into driving 
context parameters and implements algorithms for perception and learning driver's 
actions and making intelligent decisions. The driving context includes parameters 
such as driving style (e.g. Sporty, Normal, or Relaxed) and workload (e.g. high, low). 
The driving style is estimated in real time by classifying the accelerator and brake 
pedal patterns ([12] and [13]) and the speed of the vehicle. The workload is estimated 
by a set of algorithms that continually evaluates the driving demand based on the 
driver control activity, traffic density, and road conditions [14]. The learning mecha- 
nism of the intelligent vehicle control system estimates the preferences of the driver 
under different situations and states (e.g. performance, fuel economy, and comfort) 
and summarizes them in families of models that characterize the driver's actions under 
different conditions. The driver's models are developed over time and are dependent 
on the environment, driver actions, and the operation of the vehicle. The models are 
later used to "invert" the mapped relationships and advise the driver on the most ap- 
propriate actions under specific circumstances. They can provide a recommendation 
for the optimal maximal speed under specific conditions or modes of operation (i.e., 
comfort vs. sporty, performance vs. improved fuel economy). 

The SDS manages the communication between the driver and the vehicle. The im- 
portance of an in-vehicle speech interface is related to requirements for non- 
destructive hands-free control due to the ever increasing number of auxiliary 
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functions offered in vehicles, such as telephones, entertainment, navigation, and cli- 
mate control systems. Most of these voice interface systems available today are based 
on a single utterance command-and-control paradigm. Such systems typically require 
memorization of all commands from a manual to be effective, not allowing natural 
language expression. Moreover, they require the user to interact in a hierarchical, 
menu-driven fashion, which does not allow simultaneous processing of tasks. To 
address these limitations, automotive companies and suppliers have been actively 
pursuing research and development of next generation, in-vehicle intelligent dialogue 
systems [15]. The ED AS SDS provides a mixed-initiative multimodal SDS that al- 
lows for many different types of input from speech and button presses to external 
barge-in messages. Our SDS is further enhanced by including emotion recognition 
and emotive text-to-speech. The ED AS emotive SDS is presented in more detail later 
in the paper. 

The ED AS SDS is further supplemented by an Embodied Conversational Agent 
(EC A), or Avatar, as a point of reference for vehicle intelligence. The ED AS EC A is 
a three dimensional digital cockpit assistant that can provide information and offer 
driving assistance through human-like interaction, including verbal (e.g. conversation) 
or non-verbal (e.g. expressions used to represent the status of active tasks) communi- 
cation. Our ECA offers several conduits for customization, including the Avatar's 
appearance, voice, and behavior (e.g. personality). 

The Message Dispatcher queues messages that arrive asynchronously from vehicle 
subsystems or external sources and forwards them to the SDS based on the current 
status of the system, priority of the message, and driving environment. Safety related 
or urgent navigation messages have higher priority and will be rendered to the SDS 
immediately. Informational messages, whether from the vehicle or from external 
sources, will be sent to the SDS only when the system is in idle mode and when the 
driver's workload is not high. 

The Task Manager provides an interface between the SDS and in-vehicle subsys- 
tems and/or connected applications from brought-in devices or from the cloud. It is 
responsible for translating and routing requests to the appropriate application, main- 
taining the status of the active request, receiving the response or proactive notification 
from the application, and transferring the message back to the dispatcher. 

In the rest of the paper, we focus on the following novel consumer-facing ele- 
ments: Avatar, emotive SDS, and cloud-based infotainment. Other ED AS components 
are outside the scope of this paper and are more appropriate for an in-depth discussion 
within their respective fields, such as Intelligent Controls or Software Engineering. 

3 Avatar as an Advanced HMI 

There is a growing body of research that demonstrates the benefits of utilizing ECAs 
and specifically animated humanoid Avatars in human-machine interfaces [16]. Using 
Avatars has recently been recently entertained within automotive applications. For 
instance, Volkswagen implemented an Avatar as a digital assistant that can give ani- 
mated narratives of various functions/features of the vehicle's instructional panel, 
such as its built-in navigation and climate control systems [17]. Mercedes-Benz de- 
veloped a similar application employing an Avatar to give the driver valuable vehicle 
information [17]. 
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The ED AS ECA is a three dimensional (3D) Avatar that can be efficiently imple- 
mented in a cockpit environment (through an LCD display) and can accommodate 
legal and/or safety requirements: for example, the Avatar could be frozen or removed 
entirely from the screen when the vehicle is in motion. A 3D Avatar allows for realis- 
tic rendering using the latest computer graphics technologies and modern GPU hard- 
ware shader capability. 

There are several Avatar design options including a human, a cartoon character, an 
animal, or an object (see Fig. 2 for an example human and cartoon Avatar). Other 
options are to animate facial expressions only or to include body motions. 




Fig. 2. a) Human Avatar, Jackie; b) Cartoon Avatar, Toby. 



To gain insight into human interaction with computer-generated characters in a 
driving environment, we conducted a customer clinic to understand the following user 
preferences: 

• Are life-like or cartoon-like Avatars preferred? 

• Is a partial representation (head, neck and shoulder) Avatar sufficient? 

• Are human characters or animal/objects preferred? 

Thirty-five subjects participated in our survey: most were Ford Motor Company em- 
ployees located in Southwest Michigan, USA, approximately 38% of the subjects 
were under 40 years old, and three quarters of the subjects were male. Subjects were 
given a brief introduction of the ED AS concept and the role of an Avatar in the cock- 
pit. They were also shown some video clips of the early ED AS prototype system. We 
then asked them to answer a number of questions concerning their Avatar preferences. 

Fig. 3 provides a summary of the major results. We can see that the survey con- 
firms that the human character is the most preferred Avatar form, a partial representa- 
tion is sufficient for most people, and most subjects prefer a life-like Avatar to a car- 
toon-like one. As a result, we adopted a partial (head and shoulders), life-like human 
character as the ED AS Avatar. 

An important consideration in the Avatar design is a phenomenon called the "un- 
canny valley" [18]. The Japanese roboticist, Masahiro Mori, suggested in his 1970 
paper that a human's positive emotional reaction to lifelike robots will suffer a setback 
(the uncanny valley) when the lifelikeness of the robot approaches a certain level and 
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Fig. 3. Avatar survey results. 

will continue to do so until, eventually, the robot and real-human become undistin- 
guishable [19]. The uncanny valley hypothesis has since been extended to computer- 
animated realistic 3D characters widely used in video gaming software. 

The development of our Avatar was guided by the uncanny valley principle, as 
well as the requirement to balance realism and computational limitations of an em- 
bedded cockpit environment. Fig. 2 shows our ED AS Avatar, Jackie, and our cartoon 
Avatar, Toby, that were created in collaboration with Zabaware [20] and implemented 
using a commercial animation package Haptek [21]. 




Fig. 4. Several facial emotions of Jackie: listening, thinking/searching, and confused. 



Emotive Driver Advisor System (ED AS) 27 

As a result of our customer clinic, the uncanny valley principle, and computational 
requirements, we focused on Jackie Avatar. The Avatar model automatically performs 
lip syncing and allows for animation of facial emotions through a set of customizable 
state values. These values can be set in real-time in response to system needs, such as 
the emotive tag sent by the SDS. Fig. 4 shows facial expressions corresponding to 
different states of the system, such as listening, searching, and confusion in the case 
of low confidence speech recognition. 

4 Emotive Spoken Dialogue System 

A good SDS is the key to a driver's positive in-vehicle communication experience. If 
the driver cannot communicate his/her wants naturally and with low cognitive de- 
mand, this could have serious in-vehicle safety implications. Additionally, low quality 
recognition hinders the driver's emotive and practical experience and can lead to frus- 
tration. In this section, we will provide an overview of the ED AS SDS. 

The ED AS SDS is based on four main principles: 1) "open-mic" technology where 
the system is continually listening; 2) a non-hierarchical interface so that the user does 
not need to go through a multi-level menu structure; 3) a natural language speech 
engine so that the user can speak naturally to the system and not have to remember 
the right words in order to effectively command; and 4) a multimodal and mixed 
initiative framework that allows other inputs equivalent to speech, such as a button 
press or proactive massages from external sources. We developed the EDAS SDS 
using a similar framework as the one used for the International Space Station [22] and 
[23], which uses the open source Regulus development environment [24] and [25]. 

The EDAS SDS architecture is detailed in Fig. 5. The system is multimodal, and so 
we take inputs such as speech, button presses, or external messages filtered through 
dispatcher. In the case of speech, we utilize an open mic speech identifier that is con- 
tinually listening. This continual listening is activated by either a key word or an 
equivalent button press. After this is initiated, the system will be continually listening 
until the system is deactivated by a command word, a specific button command, or a 
system time-out. Speech, as a .wav file, is then simultaneously passed to a speech 
recognizer (e.g., Nuance [26]) and the emotion recognizer (e.g., Affective Media 
[27]). The speech recognizer has acoustic models that convert what is heard to a logi- 
cal form, which is then passed to the input manager. For emotion recognition, we used 
a voice-based recognizer from AffectiveMedia, Ltd. to processes the emotion detected 
in the .wav file. The recognizer analyzes acoustic features of the speech, such as 
pitch, volume, and rate and classifies them into different emotions [28]. The EDAS 
emotion recognizer was trained to classify boredom, happiness, surprise, hot anger, 
and sadness. We combined boredom and sadness into a "bad" mood and happiness 
and surprise into a "good" mood by assigning a numerical value that was mapped to 
the correct mood: this value is then used in SDS output manager to create an appro- 
priate emotive response. 

The input manager takes the logical form and the current information state and cre- 
ates a dialogue move: this provides a representation that can be understood by the 
dialogue manager. The dialogue manager receives the dialogue move from the input 
manager and the information state and creates an abstract action and a new informa- 
tion state: this new information state is passed back. The abstract action is then passed 
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to the output manager, which converts the abstract action to a concrete action, also 
taking into consideration the emotion. For example, if the driver asked, "What is the 
time?", then the abstract action might be something that is passed as (time, 8:30 pm) 
to the output manager. The concrete action created by the output manager would 
combine the abstract action with information on the personality, driving context, ag- 
gressiveness, and workload to create an emotive Text to Speech (TTS) and corre- 
sponding Avatar facial response. 



Open 

Mic 

speech 

identifier 



J± 



I 



Emotive 
TTS 



buttons 



notifications 



Speech 
Recognition 



Emotion 
Recognition 



Output 
Manager 



H 



input 
Manager 



Dialogue 
Manager 



information 
state 



Personality 



Driving Context, 
Aggressiveness, 
Workload 



Task 
Manager 



Fig. 5. ED AS Spoken Dialogue System. 



The emotive output is based on the following three elements: TTS, emotive tags to 
control the TTS (i.e., how it is said), and the script to generate the parameters for the 
facial mapping. The TTS created by the Output Manager is based on a personality, 
workload, driver's mood, driver aggressiveness, and driving context. The personality 
is predefined and can include personalities such as polite, humorous, normal, or rude: 
it is one of the custom configuration parameters. The driver's mood can be gauged 
from emotion recognition software, such as Affective Media. The driving context can 
be generated by the driving style and workload. We define the driving style as sporty, 
normal, or relaxed and workload as high or low. 

Emotive tags can be used to control how the text is said. To generate emotive text- 
to-speech, ED AS used the CereProc speech synthesis system [29] with a custom built 
voice. The CereProc speech synthesizer provides rich XML control of speech, includ- 
ing several predefined emotive tags such as happy, calm, cross, and sad that can be 
used to specify the desired emotive modulation of the speech on the fly [30] . 

The third emotive output element is the script that generates the parameters for the 
facial mapping. These parameters are used to set the Avatar emotion (i.e., facial 
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expression) state variables and are sent to Avatar animation engine for real-time emo- 
tion rendering. 

Table 1 provides an example of how the mapping can be done based on a subset of 
the inputs: workload, driver mood, and Avatar personality. So, if the driver had se- 
lected a polite personality, and asked the question, "What time is it?", the driver might 
get the following response under high workload: "It's 8:30." Under a low workload, 
the response might be, "I am happy to tell you that the time is 8:30." Under a high 
workload the responses are concise in order not to add to the driver's workload; under 
a low workload, the responses can be more extensive or provide additional related 
information. If the driver's mood is bad, the output emotion might be neutral, whereas 
if the mood is good, the output emotion could be happy or playful. 

Table 1. Example of input mapping to emotive TTS and Avatar emotion. 



Inputs 


Outputs 


Workload 


Driver 
Mood 


Avatar 
Personality 


TTS 


Emotion 


Low 


Good 


Polite 


I'd be happy to tell you the time. It is 8:30. 


Happy 


Low 


Bad 


Polite 


The current time is 8:30. 


Neutral 


High 


Good 


Polite 


It's 8:30. 


Happy 


High 


Bad 


Polite 


It is 8:30. 


Neutral 


Low 


Good 


Humorous 


I'm glad you're asking and not looking 
down at your watch. It's 8:30. 


Playful 


High 


Good 


Humorous 


In this time zone it is 8:30. 


Playful 



The ED AS SDS was further enhanced with dynamic grammar capability (i.e., 
grammar updated from a database at runtime) to support the customized cloud-based 
infotainment modules, such as personalized news or music. For example, the user 
could create news folders or music stations with any name they wanted and that name 
would be added to the recognition vocabulary at runtime. 

Although we implemented the ED AS SDS entirely as an on-board system, we rec- 
ognize the opportunity to leverage seamless integration of on-board and off-board 
speech recognition capabilities (see Fig. 6). In this hybrid implementation, it would be 
desirable that content pertaining to in-vehicle information and services be self- 
contained using on-board speech recognition and not rely on internet connection: 
requests for content or services residing on the cloud that already require internet 
connection could utilize powerful, off-board speech recognition to support potentially 
large vocabulary and grammar. Furthermore, this would enable new cloud-based 
services and their corresponding vocabulary grammar to be added anytime. 



5 Cloud-Based Infotainment and Decision Support 

One of the objectives of the ED AS prototype was to investigate new possibilities 
enabled by advances in connectivity to integrate internet-based resources to create a 
new class of personalized, interactive vehicle entertainment and decision support 
functionality. First of all, we wanted to leverage the growing number of recom- 
mender-based entertainment services, including news and music that are synergistic 
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with ED AS personalization goals. In addition, introduction of new services can hap- 
pen outside of the typical vehicle design cycle, and maintenance and upgrades of 
infotainment applications are simplified. Furthermore, the emerging cloud-based 
paradigm of software and platform as a service enables an economical and scalable 
approach to incorporate computationally intensive support. This section discusses the 
ED AS approach for cloud-based infotainment and provides an implementation over- 
view of some services for common in- vehicle activities. 
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Fig. 6. EDAS Cloud-based infotainment integration. 



Fig. 6 shows the EDAS cloud-based integration infrastructure. EDAS is connected 
to the internet using a 3G network card or cell phone with appropriate data plan. The 
vehicle has an associated agent that performs internet search tasks on behalf of the 
driver and exchanges data with the on-board system using an XML format. The agent 
manages connections to third party applications and services based on the user profile 
and the current context. The services are invoked either on-demand from the driver or 
by the internal logic that activates tasks at scheduled times or based on certain condi- 
tions. The context is defined by data that the vehicle periodically sends to the agent, 
including location, direction, speed, intended route, fuel level, fuel economy, driver 
operating pattern, OBD codes, etc.: this data updates the customer/vehicle profile. The 
agent also has access to other personal applications, such as calendars or social net- 
work sites. It can use this additional information to set up parameters in requests to 
other applications or to filter, sort, or modify data received from applications before 
sending it to the vehicle. 

In general, the agent can perform sophisticated data aggregation and inference to 
preprocess data before sending it to the vehicle, substantially enhancing decision 
support in a driving environment. For example, the driver may be looking for restau- 
rants within an hour drive along the given route. The system could estimate an 
approximate location where the driver should begin looking for restaurants, filter 
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restaurants that will be open at the estimated time of arrival, look for restaurants that 
match the driver's profile food preferences, and use social network collaborative fil- 
tering to sort the results according to social network rating. The resulting XML file 
could contain up-to date and detailed data for each restaurant including general busi- 
ness attributes, such as name, description, address and location, average price of meal 
for the given time of arrival (e.g. breakfast, lunch or dinner if applicable), as well as 
non-formal reviews from members of their social network or general customers. 

Computationally intensive models can be run on the cloud, providing an economi- 
cal way to run these models without increasing the software and hardware require- 
ments of the embedded vehicle platform. The Intelligent Refueling Advisor is one 
such application that demonstrates the functionality within ED AS and is described in 
detail in [31]. The Refueling advisor integrates the expected routes over time, esti- 
mated fuel consumption, current fuel level, and current and forecasted gas prices at 
individual gas stations along the given route, and generates a recommendation on the 
cheapest refueling option (i.e., when, where, and how much fuel to get). The system 
can provide recommendations both upon request, as well as proactively based on 
known and predicted behavior and preferences. For example, if driver's current fuel 
level is within a small range of the model's expected fuel level for the day, if the gas 
price today is cheaper than tomorrow's forecast, and if the driver has less than half a 
tank but more than the low fuel level, then suggest: "Would you like to stop for gas 
today since prices are predicted to go up tomorrow?" In addition to recommendations, 
the driver can query the system regarding the logic behind the given recommendation. 
For example, the driver could ask, "Why are gas prices forecasted to go up tomor- 
row?" The system could respond, "This effect is due to wholesale prices rising". 

Ubiquitous internet connectivity allows us to also prototype new forms of enter- 
tainment. Until now, the in- vehicle entertainment options were either 1) broadcast 
(i.e., radio, satellite) providing new content catering to groups and not to individuals 
or 2) personal media collections (i.e., MP3, CDs, tapes) that are limited to a preferred 
but already known context. The proliferation of web-based recommender systems and 
personalized entertainment services enable taking advantage of both approaches to 
provide new context to personal preferences of the given individual. In addition, 
cloud-based infotainment allows for a continuum of personalized information and 
entertainment services between in- vehicle and off- vehicle activities. 

The ED AS Personalized News Services is an attempt to create a next generation 
news radio with added personalization and interactivity. The personalized news con- 
cept has been widely explored by internet news providers, such as Google [32]. The 
ED AS prototype leverages Really Simple Syndication (RSS) aggregator services to 
bring in news articles arranged by different categories. ED AS loads blocks of 6 arti- 
cles at a time to be announced in the vehicle using the on-board TTS engine. The 
articles are represented in an XML format that includes the title and body of the arti- 
cle; it also may include additional metadata such as the author, source, time, key- 
words, and references to other relevant sources. The format of the news delivery is 
similar to many traditional broadcast radio services: first, the headlines of all articles 
are introduced followed by the articles in full. However, the ED AS news service al- 
lows the listeners to interact and guide news delivery: this allows the driver to skip 
articles outside of listeners' interests or change the order in which they want to hear it. 
Furthermore, the user can request specific clarification on certain terms/names in the 
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article, provided that the item of interest is specified in the keyword metadata: the 
system can access additional sources of information, such as Wikipedia, to provide 
details on such requests. Additionally, the user can bookmark, share the article, or 
vote on their interest in the given article. Although, the current implementation of the 
news source uses an RSS aggregator, this could be expanded to the other forms of 
internet news services. 

Music is another common form of in-vehicle entertainment that was prototyped in 
ED AS. In the last decade, there have been a growing number of web-based personal- 
ized music services and their mobile versions: Pandora [33], Last.fm [34], Musicov- 
ery [35], Slacker [36], etc. These services stream music from their vast music libraries 
based on the user's profile. These services allow users to create personalized music 
stations by first providing an example of their music interests (e.g. provide specific 
seed song name, artist, or genre) and then providing interactive feedback on each song 
selection. Specifically, the user may vote the song up or down or skip the song. These 
systems use the feedback to modify the user's profile, further improving the music 
selection. Each of the services may utilize different methods of selecting songs, such 
as collaborative filtering or nearest neighbor search within music parameter space. 

The ED AS prototype implemented music services using Pandora, which is based 
on the music genome database, where all the songs are associated with a set of around 
300-400 parameters that are set up by musicologists. Song selection is implemented 
using the closest neighbor algorithm from the user's profile. The ED AS Pandora inter- 
face (see Fig. 7) allows the user to select a specific station, vote up and down, or 
bookmark the song using either voice or button presses as input methods. 
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The personalization of cloud-based infotainment can be further enhanced by ex- 
ploring context-aware selection of infotainment content. ED AS keeps track of the 
driver's infotainment content selection, such as the given Pandora station or news 
category under different conditions and uses this information in an intelligent shuffle 
mode to provide automatic infotainment content selection (i.e., source and station) 
corresponding to the summarized driver's preferences and to the current conditions. 
This context-aware infotainment selection is based on a learning and reasoning algo- 
rithm that uses the Markov Chain probabilistic model [37]. The automatic selection 
feature is designed to work with the Entertainment Application when the infotainment 
system is in a smart shuffle mode. That is, the selection is driven by the system based 
on the driver's previously learned infotainment preferences, and the input from the 
driver is used only to reinforce / reject the recommended station selection (music, talk 
show, news etc.). When infotainment selection is controlled by the driver, the learning 
algorithm uses the driver's selections to update the transition probability model. 

6 Conclusions 

An increased level of electronics, software, computational and communication power 
in current vehicles enables new types of intelligent features that are based on the abil- 
ity of the vehicle to estimate road and traffic conditions and driver preferences and to 
adapt accordingly. This tendency of growing sophistication, autonomy, and intelli- 
gence of vehicle control systems marks a trend towards the creation of a "driver- 
aware" vehicle that appeals to the driver by offering multiple features that maximize 
driver's safety, performance, and comfort, while leaving full responsibility and con- 
trol of the vehicle to the driver. The goal is to not only satisfy the hypothetical nomi- 
nal driver, but to also meet the requirements of as many individual members of a 
customer group that is targeted by a specific vehicle model. Examples of such flexible 
features are the multiple powertrain and suspension performance modes, customizable 
and adaptable HMI, and vehicle infotainment. 

The ED AS system shows a new emotive and intelligent way for drivers and vehi- 
cles to communicate with each other. ED AS integrates recent advances in intelligent 
controls, SDS, embodied conversational agents, and affective computing in a vehicle 
environment. This system has demonstrated a new level of personalization and learn- 
ing that can lead to an emotive bond between the driver and the vehicle. 

We first demonstrated ED AS in a buck version (see Fig. 8) and subsequently im- 
plemented ED AS in a 2006 Edge. Implementation of a full featured ED AS system 
within the vehicle allowed us to experiment and experience the different features in a 
real driving environment. It also helped us to better understand what worked and what 
did not work, allowed us to demonstrate and convey the benefits of the features to the 
business more effectively, and to prepare more reliable specifications for production 
implementation. 

The ED AS concept was incorporated into the future SYNC vision unveiled as 
EVA (see Fig. 9) at the 2009 Consumer Electronics Show (CES) and included in the 
Lincoln C Concept Car at the 2009 North American International Auto Show 
(NAIAS) [38]. Furthermore, a number of features that ED AS pioneered are being 
released within upcoming versions of SYNC. 
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Fig. 8. ED AS in a buck environment. 




Fig. 9. EVA that was show in 2009 CES and NAIAS. 
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Abstract. This paper presents a method for automatically calibrating a fingernail 
imaging system used to detect arbitrary three-dimensional force on the human 
fingerpad. The calibration is accomplished using a Magnetic Levitation Haptic 
Device modified to apply calibrated forces through a flat plate to the fingerpad. 

The technique explained here is shown to accurately predict arbitrary shear 
force with RMS error of 0.3 N (which is 3% of the full range of ±5 N) and nor- 
mal force with RMS error of 0.5 N (which is 5% of the full range of — 10 N). 
The paper also demonstrates the model and explains some of the methods used 
to compensate for nonlinearities in the fingernail coloration response. 

Keywords: Force sensing, Haptics, Grasping. 



1 Introduction 

Fingernail imaging as a method for detecting force on the human fingerpad is based on 
the coloration effect illustrated in Figure [T] As the finger is pressed against a surface, 
the interactions between the bone and nail compress regions of the finger pulp and 
thereby constrict blood flow through the capillaries. This causes blood to pool in some 
regions while evacuating other areas, creating a band near the distal end of the fingernail 
which begins to whiten. Simultaneously, the rest of the nail except the lunula reddens. 
Additionally, areas of the flesh alongside the nail whiten. If, in addition to normal force, 
shear force is applied to the fingerpad, different patterns appear on the nail. 

This effect is fast, requiring less than one second to reach steady-state. The coloration 
response described here is broadly similar in the general population [ 1 ] and so studies 
of this effect are applicable to a wide range of people. Surprisingly, when individual 
calibration is applied, it is possible not simply to qualitatively predict the direction of 
the force, but also to quantitatively estimate the magnitude with a reasonable degree of 
accuracy. This paper presents the current method of calibration for fingernail imaging 
used to create a model for predicting fingertip force. 

Mascaro and Asada [2] first proposed fingernail imaging as a method for detecting 
force on the human fingerpad. That work used a combination of photodetectors and 
LEDs embedded in an on-nail sensor to illuminate and perceive the coloration changes. 
These wearable sensors, with appropriate calibration, were able to predict shear force 
with an RMS error of 0.5 N over a range of ±2.25 N and normal force with an RMS 
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Fig. 1. Illustration of Coloration Effect: (Left) Finger with no Contact (Right) Finger with IN 
Normal Contact Force. 

error of 1.0 N over a range of — 3 N 0. There are, however, two limitations in 
these sensors. Space restrictions prevent more than a few photodetectors being placed 
on the nail, which limits the amount of data that may be recorded from the technique. 
Additionally, the sensors must be custom manufactured for each test subject. 

To bypass these restrictions, the development of a high-resolution imaging technique 
was created [4]. This involves capturing images of the fingernail and the surrounding 
skin with a digital camera. This method was able to predict normal force with an RMS 
error of 0.3 N over a range of — 10 N. Later experiments improved this prediction error 
to just 0. 1 N when simultaneously predicting normal forces over a range of — 10 N and 
shear forces oriented either laterally (across the finger) or longitudinally (along the fin- 
ger) over a range of ±2 N 1 5 ] . Another important result of these initial experiments was 
that the green channel in an RGB image showed the largest response to the force. In ad- 
dition, it was found that the direction of the force could even be qualitatively estimated 
without individual calibration with 90% accuracy |6|. These levels of accuracy were 
found to hold even as the resolution of the images was reduced to 10 x 10, indicating 
that extremely high-resolution images were not necessary for accurate measurements. 
However, this method introduced new challenges, most notably positioning the finger 
relative to the camera and adjusting for differences in lighting. For the time being, these 
have been carefully controlled, but they will need to be addressed at a future time. 

Both methods have been shown to be accurate in predicting both the magnitude 
and direction of force, but require individual calibration. The automated calibration 
technique presented here is able to predict arbitrarily-directed shear force with an RMS 
error of ±0.3 N over a range of ±2.5 N and normal force with an RMS error of ±0.3 
N over a range of — 10 N Q. Section [2] of this paper demonstrates the automated 
calibration procedure. Section [3] contains a discussion of the analysis techniques used 
on the calibration data. This includes the method used for image registration as well as 
the model developed to represent the effect of force on the coloration effect. The least- 
squares method used to generate the values used in the model is discussed. Section [4] 
presents some results from the model and outlines possibilities for future work. 
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Fig. 2. Diagram of Calibration Setup. 



2 Calibration Procedure 



To achieve the stated accuracy of force prediction, individual calibration is necessary. 
This calibration procedure follows a general form: A finger is placed on a force sensor 
with a camera placed above. A series of forces is exerted between the finger and the 
sensor while the camera records images. Once the calibration is finished, these images 
are compared to the data obtained from the force sensor to form a model for predic- 
tion. Although in the past, manual calibration has been performed, requiring the test 
subject to exert forces, the fatigue that resulted limited a 20-minute session to capturing 
fewer than 100 images. To allow for faster calibration and reduce the fatigue on the test 
subject, an automated calibration procedure has been developed. 

Automated calibration requires a special setup such as that shown in Figure [2] The 
test subject's finger is placed in a passive restraint that restricts movement and adjusts 
the finger joint angles without affecting blood flow. The force sensor is attached to a 
force-controlled Magnetic Levitation Haptic Device (MLHD) [8]. A camera is placed 
above the finger to record the images. The MLHD applies the desired force to the 
test subject and records both image and force automatically. The characteristics of the 
MLHD are such that the calibration routine is capable of recording one image every 0.5 
seconds. In a 20-minute session with a few short breaks, more than 1,500 images may 
be collected. The actual calibration setup is shown in Figure [3 

The MLHD is controlled by means of a standard PIV force feedback controller with 
a feed-forward term. The signal from the controller is sent to the MLHD as a force 
command. This controller has a relatively slow step response, with a 98% settling time 
of 0.2 seconds. Since the calibration trajectories do not use steps in force but ramps, 
this is not a problem during calibration. 

The desired force space to be covered during calibration is a cone with a slope equal 
to the coefficient of static friction fi between the finger and the contact surface. Different 
calibration trajectories have been designed to cover the force space in different ways. 
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Fig. 3. Photo of Calibration Setup. 
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Fig. 4. Example Calibration Trajectories 




Two sample calibration trajectories are shown in Figured Some use a Cartesian (x-y- 
z) grid, while others use cylindrical coordinates (r-0-z). The distance between adjacent 
points on the grid, AF varies, as does the value used for /i, the slope of the outside 
edge of the cone. During calibration, each force level is held for 0.4 seconds to allow 
the camera time to record an image. Then a 0.1 -second ramp is used to transition to the 
next force level. 



3 Data Analysis 

Once collected, the calibration data must be processed to develop a model that may 
then be used to predict force. The processing takes place in two steps: Registration and 
Model Formulation. 
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Fig. 5. Finger Images Demonstrating the Need for Registration. 

3.1 Registration 

The problem of registration is evident in Figure [5] Here, four calibration images are 
shown. Although the viewing plane of the camera is still parallel to the plane of the 
fingernail, the position and orientation of the finger has shifted in each of the images. 
Before these images can be used to generate a model relating pixel intensity to force, 
they must be registered so that all images have the finger in the same position. One 
challenge of this task is that the finger is relatively featureless compared to typical 
registration subjects. Additionally, the calibration itself varies the intensities within the 
finger, potentially changing some features that might otherwise be used to register the 
nail. 

Although several registration methods have been explored, the current procedure 
uses what is known as the Scaled Rigid-Body Transformation. This takes advantage of 
the knowledge that the image plane is always parallel to the finger during calibration. 
Since the finger only rotates in-plane, it is can be shown that proper registration requires 
only translations and rotations about the z-axis of the finger. The process is illustrated 
in Figure [6] An original image is shown in part (a). The finger is found in the image 
through a connected component analysis and the major axis of the finger is determined 
(b). This axis is rotated to a predetermined orientation (c) and the image is cropped to 
include only the largest connected component (d). The image may then be resized as 
needed to correspond to the target resolution. 



3.2 Finger Model 

A model of the coloration response of the finger related to force on the fingerpad was 
first explored in [2]. This initial model was highly nonlinear and took into account the 
dynamic behavior of the fingernail and surrounding skin. Later experiments 1 9 ] have 



42 



T. Grieve et al. 





Fig. 6. Registration Procedure (a) Original Image (b) Largest Connected Component with Major 
Axis (c) Rotated Image with Bounding Box (d) Cropped Image. 



shown that the dynamic response of the fingernail is fast enough that it is not necessary 
for most purposes to account for this, and so the dynamic response is not considered 
here. 

Independence. The static nonlinear behavior of the fingernail may be categorized in 
one of two general types: independence and saturation. Independence refers to the way 
that certain regions' coloration response is independent of force in one, two or all three 
directions of force. For example, the lunula (the light-colored crescent- shaped area at 
the base of the nail) does not change color in response to any force. Other regions 
respond only to force in one direction. Research into methods for correcting for this 
type of nonlinearity is currently ongoing. 

Saturation. Certain regions of the fingernail and surrounding skin saturate at relatively 
low levels of force, while other areas continue to change color even as the normal force 
exceeds 10 N. When considering only one direction of force, it is relatively easy to 
see a sigmoid shape appear in the data. As multiple directions of force are added, the 
response becomes more difficult to visualize and it is more challenging to recognize 
patterns of saturation. 

The procedure to compensate for saturation is illustrated in Figure [7] First, the data 
is sorted to reveal the saturation. A fourth-order polynomial is fitted to the data. The 
maximum value of the gradient of that polynomial is found. The data points above 
and below that maximum value where the gradient reaches 20% of the maximum are 
found. The intensities of those two points are identified as the upper and lower saturation 
threshhold, respectively. If no upper or lower saturation limit is found, the minimum or 
maximum intensity of the pixel is chosen for the respective saturation limit. The data 
between these two limits is then chosen and fit using the linear model detailed next. 
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Fig. 7. Saturation Detection in One Force Dimension. 

Linear Model. The linear model of the finger is based on the assumption that the 
intensity of a given pixel (jpi) is a linear combination of the applied forces (f x , f y and 
f z ) with some offset. This results in an equation for the i\h pixel: 



Pi = a,i + bif x + Ci/j, + d»/ z 



(1) 



An m x n image represented by a pixel vector p = [pi P2 — - Pmn ] with a corre- 
sponding force vector f = [f x fy fz] may then be represented by a single matrix 
equation: 
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(2) 



To form a matrix equation representing an entire calibration set of k images, first the 
transpose is taken: 



[P1P2 ■■■Pmn] = [1 



The k equations are then stacked: 
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forming a matrix equation P = 
Ordinary Least Squares (OLS): 
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FA. These equations are then solved using standard 



A = (F T F) F T P 

The next section explains how the matrix A is used to predict force. 



(5) 



3.3 Prediction Model 

Different least- squares methods have been considered for use in analyzing the calibra- 
tion data. Although a Total Least Squares approach might seem appropriate, since there 
is error in both the force sensor readings and the camera intensity data, the force sensors 
used in these experiments are so accurate (±0.8mN) compared to the cameras that this 
method increases the prediction error. Currently, OLS is used to determine the model 
and Weighted Ordinary Least Squares (WLS) is used to validate the model and predict 
forces. 

Since it would be undesirable to train the model to artifacts in the data, it is ex- 
pected that many more images will be recorded than the minimum required to exactly 
determine all of the coefficients. In other words, k ^> m x n, where m x n represents 
the resolution after the image has been registered and cropped. This ensures that the 
situation is more like curve fitting and less like polynomial interpolation. 

To generate the model, the A matrix is partitioned as follows: 



A = 



where 



B^ 



a 2 



(6) 



(7) 



and 



B 
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(8) 



In addition to these two submatrices, the prediction model requires the co variance ma- 
trix of the calibration images U = cov(P). This prediction model may then be ex- 
pressed as f = C (p — a), or: 
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f=(B T ^- 1 B)" 1 ^- 1 B T (p-a) 



(9) 



This is recognizable as a weighted least squares solution to the equation p = Bf + a 
with weighting matrix 27 -1 . The matrix C = (B T 27 _1 B) ~ 27 _1 B T and the vector a 
are stored after calibration is completed for future use in force prediction. 

4 Results 

4.1 Model Validation 

Verification of the model is achieved using a set of images collected using the proce- 
dure detailed in Section [2 The Measured Force f m is stored at the time the images 
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Fig. 8. Validation Error for Subject 5. 
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Fig. 10. Validation Error for Subject 7. 
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Fig. 11. Validation Error for Subject 8. 



are collected. The images are registered using the procedure explained in Section [3TTI 
Equation [9] is used to obtain the Predicted Force fp for each one. This force is com- 
pared to f m and the RMS error in each direction for the set of images is determined. 
This error demonstrates the validity of the model in predicting forces when presented 
with new images. A plot of f m vs. fp for test subject 5 is shown in Figured! For this 
subject, 1300 calibration images were collected and another 200 validation images were 
set aside. The 1300 were used to form the model. In this case, the force was predicted 
on the validation images with an RMS error of 0.2 N in the x-direction, 0.3 N in the 
y-direction and 0.3 N in the z-direction. 
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The validation results for subject 6 are shown in Figure [9] For this subject, only 900 
training images were collected for use in forming the model. Another 200 validation 
images were collected. The predicted force on these validation images had an RMS 
error of 0.2 N in the x-direction, 0.3 N in the y-direction and 0.4 N in the z-direction. 

Subject 7's validation results are shown in Figure [TUJ 750 calibration images were 
recorded for this subject and 200 validation images tested the accuracy of the model. 
The RMS error of the validation image prediction was 0.2 N in the x-direction, 0.3 N 
in the y-direction and 0.4 N in the z-direction. 

The validation results for subject 8 are shown in FigureQj] For this subject, 1200 cal- 
ibration images were collected with 200 in the validation set. The x-, y- and z-direction 
RMS errors were 0.2 N, 0.3 N and 0.3 N, respectively. 

In total, 20 individuals have been tested using this automated calibration routine. The 
validation results are similar to those described above. The average RMS error for shear 
force (x- and y-direction) is 0.3 N (which is 3% of the full range of ±5 N). The average 
RMS error for normal force (z-direction) is 0.5 N (which is 5% of the full range of 
0-10 N). 

5 Conclusions 

This paper has shown a method of predicting force using fingernail imaging. The tech- 
nique demonstrated is capable of predicting forces in three dimensions with RMS error 
less than 0.5 N. Future work includes the application of this method to human grasp- 
ing exeriments. Success in this endeavor would allow grasping experiments to proceed 
without the need to instruct test subjects to precisely grasp objects exactly where the 
test object has been instrumented. Instead, a more natural grasp could be implemented, 
allowing the subject to comfortably position the hand in whatever manner desired. This 
would allow for the removal of one more potentially confounding variable in grasping 
studies, increasing their reliability. 
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Abstract. This chapter presents the design and the validation of the actuators 
control system for a morphing wing application. Some smart materials, like 
Shape Memory Alloy (SMA), are used as actuators to modify the upper surface 
of the wing made of a flexible skin. The finally adopted control law is a combi- 
nation of a bi-positional law and a PI law. The controller is validated in two ex- 
perimental ways: bench test and wind tunnel test. All optimized airfoil cases, 
used in the design phase, are converted into actuators vertical displacements 
which are used as inputs reference for the controller. In the wind tunnel tests a 
comparative study is realized around of the transition point position for the ref- 
erence airfoil and for each optimized airfoil. 

Keywords: Morphing wing, Shape memory alloy actuators, Bi-positional and 
PI control design, Numerical simulations, Experimental validation. 



1 Introduction 

Today, aeronautical transport is evolving at a very fast pace, as compared to the be- 
ginning of the aviation era; aeronautical traffic tripled during the last fifteen years, 
and by 2025, is projected to double from today's levels. This traffic is expected to see 
an estimated +3.0% increase in the number of passengers per year, to approximately 
1 billion by 2016; will be accompanied by a load factor increase of 81.7% as com- 
pared to today's values by 2025 [1]. This evolution will need the new technologies 
development in the design and building of modern aircraft equipped with active con- 
trol systems [2] . 

During the same time period, fuel cost increases will lead to a slowdown in the 
aerospace industry, which in turn will stimulate research to find technological solu- 
tions; this will specifically involve designing new fuel economy consumption meth- 
ods. A new green trend has indeed started to spread out from the automobile industry 
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into the aircraft industry, in which research is being carried out to reduce fuel con- 
sumption by reducing drag, which is directly related to the airflow type around the 
aerodynamic aircraft body design. The drag reduction concept is connected to the 
laminar flow and to the displacement of the transition point between laminar and 
turbulent flows towards the trailing edge [2] . 

Numerous studies show that the transition between laminar and turbulent flows is 
influenced by the shape of the wing airfoil. Aerodynamic studies from the beginning 
of the aviation history show that for a certain flight condition characterized by a given 
Mach number and a given Reynolds number, the airflow around a wing airfoil is 
laminar at the leading edge, but becomes turbulent at a certain point due to air viscos- 
ity. A turbulent flow is not desired because of its negative effect in terms of drag 
increase, which, over time, leads to high fuel consumption, and consequently, in- 
creased operating costs [2] . 

Many researches are made around the world in the new challenge field related to 
the morphing aircraft, with the purpose to improve operational efficiency, particularly 
by reducing fuel consumption ([3]-^[9]). Therefore, a lot of architecture were and are 
still imagined, designed, studied and developed, for this new concept application. One 
of these is our team project including the numerical simulations and experimental 
multidisciplinary studies using the wind tunnel for a morphing wing equipped with a 
flexible skin, smart material actuators and pressure sensors. The aim of these studies 
is to develop an automatic system that, based on the information related to the pres- 
sure distribution along the wing chord, moves the transition point from the laminar to 
the turbulent regime closer to the trailing edge in order to obtain a larger laminar flow 
region, and, as a consequence, a drag reduction. 

The objective of here presented research work was to develop an actuation control 
concept for a new morphing mechanism using smart materials, like Shape Memory 
Alloy (SMA), as actuators. These actuators modify the flexible upper surface of the 
wing, changing the airfoil shape. The morphing wing project was developed by Ecole 
de Technologie Superieure in Montreal, Canada, in collaboration with Ecole Poly- 
technique in Montreal and the Institute for Aerospace Research at the National Re- 
search Council Canada (IAR-NRC). 

To achieve the aerodynamic imposed purpose, a first phase of the studies involved 
the determination of some optimized airfoils available for 35 different flow conditions 
(five Mach numbers and seven angles of attack combinations). The optimized airfoils 
were derived from a laminar WTEA-TE1 reference airfoil [10], [11], and used as a 
starting point for the actuation system design. The transition point position estimation 
is made using the information received from a pressure system sensors (optical and 
Kulite types) equipping the upper face of the wing. Two architectures were developed 
for morphing system: open loop and closed loop. The difference between the two 
architectures is given by using or not using the position of transition point as a feed- 
back signal for the actuation lines control. Here described work was developed in the 
open loop phase; in this phase were made numerical and experimental studies related 
to the aerodynamics of the morphed wing, to the flexible skin realization, to the actua- 
tion system, to the control of the actuation system, and, also, to the real-time determi- 
nation and visualisation of the transition point using the pressure sensors system. 
Here, the pressure sensors using is limited to the monitoring of the pressure distribu- 
tion and of the RMS pressure distribution in the boundary layer [12]. 
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2 Structure of the Morphing Wing System 

The chosen wing model was a rectangular one, with a reference airfoil WTEA-TE1, a 
chord of 0.5 m and a span of 0.9 m. The model was equipped with a flexible skin 
made of composite materials (layers of carbon and Kevlar fibers in a resin matrix) 
morphed by two actuation lines (Fig. 1). Each actuation line uses SMA wires as ac- 
tuators. In the same time, 32 pressure sensors (16 optical sensors and 16 Kulite sen- 
sors), were disposed on the flexible skin in different positions along of the chord. The 
sensors are positioned on two diagonal lines at an angle of 15 degrees from centerline. 
The rigid lower structure was made from Aluminum, and was designed to allow space 
for the actuation system and wiring [12]. 



Flexible skin 
(morphed extrados) 



Trailing 
edge 




Support plate for 
actuation system Rigid 
intrados 



Cavities for 
instrumentation 



Fig. 1. General architecture of the mechanical model. 



Starting from the reference airfoil, depending on different flow conditions, 35 op- 
timized airfoils were calculated for the desired morphed positions of the airfoil. The 
flow conditions were established as combinations of seven incidence angles (-l°,-0.5°, 
0°, 0.5°, 1°, 1.5°, 2°) and five Mach numbers (0.2, 0.225, 0.25, 0.275, 0.3). Each of the 
calculated optimized airfoils must be able to keep the transition point as much as 
possible near the trailing edge [12]. 

The SMA actuator wires are made of nickel-titanium, and contract like muscles 
when electrically driven. Also, these have the ability to personalize the association of 
deflections with the applied forces, providing in this way a variety of shapes and sizes 
extremely useful to achieve actuation system goals. How the SMA wires provide high 
forces with the price of small strains, to achieve the right balance between the forces 
and the deformations, required by the actuation system, a compromise must be estab- 
lished. Therefore, the structural components of the actuation system must be designed 
to respect the capabilities of actuators to accommodate the required deflections and 
forces [12]. 

Each of our actuation lines uses three shape memory alloys wires (1.8 m in length) 
as actuators, and contains a cam, which moves in translation relative to the structure 
(on the x-axis in Fig. 2). The cam causes the movement of a rod related on the roller 
and on the skin (on the z-axis). The recall used is a gas spring. So, when the SMA is 
heating the actuator contracts and the cam moves to the right, resulting in the rise of 
the roller and the displacement of the skin upwards. In contrast, the cooling of the 



54 



T.L. Grigorie et al. 



SMA results in a movement of the cam to the left, and thus a movement of the skin 
down. The horizontal displacement of each actuator is converted into a vertical dis- 
placement at a rate 3:1 (results a cam factor cy=l/3). From the optimized airfoils, an 
approximately 8 mm maximum vertical displacement was obtained for the rods, so, a 
24 mm maximum horizontal displacement must be actuated [12]. 

Flexible skin 




Roller 



Rod 



Cam 




ic^j / / 

^^^^- ^ Compression Support plate for Three SMA wires 



Cooling Heating 



spring 



actuation system 



Fig. 2. The actuation mechanism concept. 



3 Control Design and Numerical Simulation 



The control of SMA actuators can be achieved, in principle, using any method of 
position control, but the specific properties of SMA actuators, such as hysteresis, the 
first cycle effect and the long term changes must always be considered. Starting from 
the established concept of the actuation system the operating schema of the controller 
can be organized as is presented in Fig. 3. Based on the 35 studied flight conditions a 
database of the 35 optimized airfoils can be built. Therefore, for each flight condition 
results a pair of optimal vertical deflections (dFi opt , dF 2 opt) for the two actuation lines. 
The SMA actuators morph the airfoil until the obtained vertical deflections of the two 
actuation lines (dFi rea b dF 2 reai) become equals with the required deflections (dFi opt , 
di^opt)- The morphed airfoil (real airfoil) vertical deflections in the actuation points 
are measured using two position transducers. The role of the controller is to elaborate 
an electrical current command signal for the SMA actuators on the base of the error 
signals (e) between the required vertical displacements and obtained displacements. 
Because the two actuation lines are identical the designed controller will be valid for 
both of them. 

The first phase of the controller design supposes the numerical simulation of the 
controlled actuation system. Therefore, a model of SMA actuator was required. In our 
system a non-linear model was used (a numerical finite element one) build by Prof. 
P. Terriault using the theoretical model of Lickhatchev [13]. The SMA model has as 
inputs the initial temperature of the alloy, the electrical current that heats the alloy and 
the applied force; the outputs are the displacement of the actuator and the temperature 
of the alloy during functioning. According to this model, to use the shape-changing 
characteristics the SMA needs to be initialized by an external force, which obliges it 
to go initially through the transformation phase and further to revert to the initial 
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Fig. 3. Operating schema of the SMA actuators control. 

phase through the cooling phase. Before these two phases, the control can't be real- 
ized, due to the intrinsic behavior of the SMA [13], [14]. 

Looking the wing as an object moving through the atmosphere, aerodynamic forces 
are generated between the air and the wing; these forces vary in function of the air- 
flow characteristics (Mach number, Reynolds number and a - angle of attack). Since 
the aerodynamic forces are suction forces, it tends to lift the skin and to shorten the 
SMA wire. Against the aerodynamic forces action the elastic force of the flexible 
skin. A gas spring is needed in order to counteract the aerodynamic forces, so that the 
resultant force that acts on the SMA wire is given by equation 

F = F +(F y -F )-c f . (1) 

SMA spring v skin aero ' f \ / 

To have the premises necessary to initialize the SMA actuators in any conditions, they 
are loaded by the gas spring even if there are no aerodynamic forces applied on the 
flexible skin. So, the equation (1) becomes 

F SM =(F , . +k . -8 k ) + (k t . -8 -F )-c f9 (2) 

SMA \ pretension spring h ' v skin v aero ' f ' V^V 

where 

F =F , . +k -8,, F. . =k t 8 . (3) 

spring pretension spring h ' skin skin v V- 7 / 



Fsma is the SMA resultant force, F spring - gas spring elastic force, F skin - elastic force 
produced by the flexible skin, F aero - aerodynamic force, F pretension - pretension force of 
the spring, c f - cam factor (1/3), k spring and k skin are the elastic coefficients of the 
spring, and of the skin, respectively, & h and 8 V are the horizontal and vertical actuated 
displacements. 

Implementing the SMA actuators model in a Matlab S -function, the simulation 
model in Fig. 4 was obtained. As can be observed, to control the SMA actuators, an 
adequate electrical current must supply it. The length of the SMA wires is a complex 
function of the SMA load force and temperature, the last one being influenced by the 
supplying current in time and by the interaction of the wires with the environment in 
theirs cooling phase (when the electrical supply is removed) [15]. 

The block "Mechanical system" in Fig. 4 was modeled accordingly with the equa- 
tions (1) to (3). As is shown in Fig. 2, the horizontal and the vertical actuated dis- 
tances are correlated by using the "cam factor" C/=l/3. Therefore, the aerodynamic 
and the skin forces (F aero and F skin ) are reflected in the SMA load force (F SMA ) with the 
same rate. The gas spring has a preloaded force of 1500 N and a linear elastic coeffi- 
cient of 2.95 N/mm. In simulations a linear elastic coefficient of approximate 100 
N/mm was considered for the skin. 
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Fig. 4. SMA actuators Simulink model. 

The envelope of the SMA actuator, obtained through numerical simulation for dif- 
ferent aerodynamic load cases, is shown in Fig. 5. 
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Fig. 5. Simulated envelope of the SMA actuator. 



As can be observed from Fig. 5, to obtain a skin maximum vertical displacement 
(8 mm) in absence of aerodynamic force, it is required a high temperature (approxi- 
mately 162°C) in order to counteract the spring force. Because the ability of the SMA 
wires to contract is dependent upon Joule heating to produce the transformation tem- 
perature required, the need in higher temperature is reflected by a need in higher elec- 
trical current. Due to the fact that the aerodynamic forces reduce the actuators load the 
required current and temperature values are decreased; i.e. for F aero =1800 N the need 
in temperature for the maximum vertical displacement obtaining is approximately 
90°C. From other point of view, the ability of the SMA wires to return to their origi- 
nal configuration is dependant upon the ability of the system to cool the wires. The 
simulated SMA model offers just summary information about this subject, the proper 
heating and cooling of the wires being observed only in the moment of a thermody- 
namic analysis of the physical morphing wing. The system architecture play a big role 
in the wire cooling by the convection process, and also the performances of the 
system can be negatively influenced by heat transfer from actuators to the other 
components. 
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According with Fig. 3, the integrated controller purpose is to control the SMA ac- 
tuators in terms of supply electrical current so that to cancel the deviation e between 
the required values for vertical displacements (corresponding to the optimized air- 
foils) and the real values, obtained from position transducers. As mentioned previ- 
ously, the design of such controller is difficult considering the strong nonlinearities of 
the SMA actuators characteristics, nonlinearities significantly influenced by the forces 
with which they are tense. The chosen design procedure consisted of the following 
steps: 

Step 1 : numerical simulation of the SMA model actuators for certain values of the 
forces in the system; 

Step 2: approximation of the system with linear systems in the heating and cooling 
phases using the System Identification Toolbox of Matlab and the numerical values 
obtained at the Step 1 ; 

Step 3\ the choice of the controller type and its tuning for each of the two SMA actua- 
tors phase - heating and cooling; 

Step 4: the integration of the two obtained controller in a single one followed by its 
validation for the general model of the system (non-linear). 

Because the team that established the actuation line architecture ([16]) suggested that 
the pretension force of the gas spring must have the value F pretension =1500 N, F aew = 
1500 N value was chosen in numerical simulations for the aerodynamic force. Simu- 
lating a cooling phase followed by a heating phase with the model in Fig. 4, the blue 
characteristics depicted in Fig. 6 were obtained. In the first graphical window of the 
figure is presented the SMA wire length changing in time (8^), while in the second 
window the SMA wire temperature values in the two phases are shown. One observes 
that a SMA wire dilatation results in the cooling phase, and a wire contraction is ob- 
tained in the heating phase. For a horizontal actuation distance of approximately 24 
mm the wire temperature reaches a value near by 100°C. Note are the transient time to 
reach the steady-state values for the two phases: approximate 60 s for the cooling 
phase and approximate 40 s for the heating phase. For the steady-state, after the cool- 
ing phase, the numerical simulation obtained forces were: F SMA =1000 N, F skin =0 N 
and F spring =1500 N. In this steady-state the system is relaxed in terms of mechanical 
and the vertical displacement of the actuator is null. For the steady- state, after the 
cooling phase, the numerical simulation obtained forces were: F SMA =1331 N, 
F skin =266.l N and F spring =l51l N. This steady-state corresponds to the actuation sys- 
tem maximal vertical displacement of approximately 8 mm. 

Using the System Identification Toolbox of Matlab and the numerical values char- 
acterizing the 8h response at a series of successive step inputs, two transfer functions 
(TF) were found for the two SMA phases: 

_ 0.0177388 • s 2 + 0.004017 • s + 0.0241958 
* s 3 -1.43582 s 2 + 0.64742 s -0.001018 ' 

0.3535 -s + 0.2672 (j 

c s 2 - 1.9386 s + 0.01 1242' 

where H h (s) and H c (s) are the transfer functions for heating and cooling phases. The 
displacements b h , corresponding to the linear systems obtained through the two phases 
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identification, are depicted with red line in Fig. 6. A very good approximation can be 
also observed for the two phases through the identification in simulated conditions. 
The previously established transfer functions help to the controller type choice for 
each phase and to its tuning. 




300 



300 



Fig. 6. Actuator displacement and temperature vs. time. 



Considering the significance of physical controlled phenomenon, that the SMA 
wire must be heated to contract and then cooled to dilate by providing an appropriate 
electrical current by the control block, it is normal that in the cooling phase the 
actuators not be powered. This phase of cooling may occur in controlling not only a 
long-term phase, when it ordered a switch between two values of the actuator dis- 
placements, but also as a short-lived phase, which occurs when the real value of the 
deformation exceeds its desired value and is need to cool the actuator wires. On the 
other hand, it is imperative that in the heating phase actuators to be controlled so that 
the stationary error of the automatic system to be zero. Therefore, for this phase one 
opted for a simple architecture of the controller of PI type (proportional-integral). It 
combines the advantages of proportional type controller, which reduces substantially 
the overshoot and lead to a short transient time, with the benefits of the integral con- 
troller, which cancels the steady-state system error. As a consequence, the resulted 
controller must behave like a switch between cooling phase and heating phase, situa- 
tions where the output current is A, or is controlled by a law of PI type. The two 
phase's interconnection leads to an integrated controller, which can be viewed as a 
combination of a bi-positional controller (an on-off one) and a PI (proportional- 
integral) controller. 

The input-output characteristic of a bi-positional (on-off) controller can be de- 
scribed by the equation 
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i(0 = 



-i n , if e<0, 
i , if e > 0, 



(5) 



where /(£) is the command variable (electrical current in our case) in time, i m reflects 
the value of the command and e is the operating error (Fig. 3). The PI controller law is 
given by 



i(t) = K p ■ e(t) + 



K r \e(t) 



df, 



(6) 



with K P - the proportional gain, and K t - the integral gain. Combining the two control- 
lers in a single one, based on the rules previously mentioned results the control law of 
the integrated controller as the form 



i(t) = < 



10, if e < 0, 



K p -e(t) + K r je(t)-dt, if e > 0. 



(7) 



The optimal tuning of the controller in heating phase was realized using an integral 
criterion, the error minimum surface criterion, very well known in the literature as 
Ziegler-Nichols criterion [17]; the tuning methodology is: a) the regulator is consid- 
ered as a proportional one (P) and it is tuned with respect to the K P parameter; b) the 
amplification factor K P is increased until the response of the automatic system will be 
self- sustained oscillatory. One memorizes the value K P0 of K P for which the system 
has an oscillatory behavior and the value of oscillations semi-period (To). The optimal 
values for the parameters of the PI regulator are determined using the relations [17]: 



K P =0A5K P0 , ff 7 = ^7(0.85 -71). 



(8) 



Follows the controller tuning steps the next numerical values for the PI controller 
parameters were obtained and/or were calculated: 



K po =3984, T = 2.68s, 

K =1792.8, K= 787.0061. 



(9) 



As a consequence, the controlled system in heating phase can be modeled with an 
approximate linear system with the block schema in Fig. 7. 



Wished SMA 
elongation 



O- 



PI controller 



SMA heating 
model 



K p +- 



Current 



a 2 -s +a l -s + a 
L -s 3 +b n -s 2 +b, -s + b n 



SMA real 
elongation 



Fig. 7. The block schema with transfer functions of the heating phase linear model. 



The parameters a +a 2 and b +b 3 in the schema are the coefficients of the H h (s) trans- 
fer function nominator and denominator in ascending power of s (eq. (4)). The open 
loop transfer function of the controlled heating phase is 
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H l (s) = C p ,(s)-H h (s)= q > S i +q >* + ^ + q ° , 

o' v/ PI V ' « V ' 7 4,7 3,7 2,7 ' 

b 3 s + b 2 s + b x s +Z? s 
while the closed loop transfer function is 

q 3 s 3 +q 2 s 2 +q l s + q 



H cl (s) = C PI (s)-H h (s): 



(10) 



(11) 



The included coefficients are 

g 3 = /r p a 2 = 31.8021, q 2 = K p a x + K t a 2 = 21.1622, 
^ = K p a + /^ = 46.5396, q = K t a = 19.0422, 

respectively 

r 4 =b 3 = 1, r 3 = b 2 + /T p a 2 = 30.3663, 

r 2 =b x + K p a x + # 7 a 2 = 21.8096, 

r x = b + /T p a + ^ = 46.5386, r = /T 7 a = 19.0422. 



(12) 



(13) 



Cpi (s) is the transfer function of the PI controller. The poles of the close loop transfer 
function H d (s) result with the values 



p x = -29.6837, p x g R_, p 4 = -0.4453, p l e R 
p 2 = -0.1 187 + 1. 1943 i, p 2 eC, Re(/? 2 )eR_, 
/? 3 =-0.1187-1. 1943 -i, /j 3 gC, Re(/? 3 )e R_. 



(14) 



One can observe that all poles of the transfer function are placed in the left-hand side 
of the s-plane, and the obtained system is stable. 
In the state- space representation 



x(t) = Ax(t) + Bu(t), 
y(t) = Cx(t) + Du(t), 



(15) 



the state matrix A, the input matrix B, the output matrix C and the feed-forward matrix 
D, were obtained by the forms 



A = 



-30.3663 -21.8096 
1 

1 





-46.5386 -19.0422 




1 



(16) 



B T =[\0 0], C = [31.8 21.1 46.5 19], D = 0. 



Evaluating the controllability and observability of the system (P and Q matrices) 
results 
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1 -30.3663 900.3025 -26723.119 

1 -30.3663 900.3025 

1 -30.3663 

1 



(17) 



Q = 



31.8 


21.1 


46.5 


19 


-944.5 


-647 


-1460.9 


-605.5 


28035.4 


19139.3 


43352.4 


17986.3 


-832193.6 


-568090.6 


-1286744.7 


-533857.8 


rank(P) 


= rank(£) = 


system order = 


= 4. 



(18) 



(19) 



As a consequence, the system is completely controllable and observable. 

Based on the previously considerations, the final form of the integrated controller 
law is 



i{t): 



f 0, if e < 0, 



11792.8 -e(t) + 787.0061 ^e(t)-dt, if e>0. 



(20) 



Introducing the controller in a general block schema, with the non-linear SMA model, 
the Simulink model in Fig. 8 was obtained for the SMA actuators control (see Fig. 3). 
The input variable of the schema is the desired skin deflection and the output is the 
real skin deflection. 
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Fig. 8. The simulation model for the controlled SMA actuator with non-linear model. 



The "Integrated controller" block contains the implementation of the law de- 
scribed by equation (20) and of the preliminary observations related to the SMA ac- 
tuators physical limitations in terms of temperature and supplying currents. Its schema 
is shown in Fig. 9. The inputs of the block are the control error (difference between 
the desired and the obtained displacements) and the wires temperature, and the output 
is controlled electrical current applied on the SMA actuators. There are two switches 
in the schema; the first one chooses one of the two options in the control law (20) and 
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the second one switching the electrical current value to OA when the SMA tempera- 
ture value is over the imposed limit. Also, a current saturation block is used to prevent 
the current increase over the physical limit supported by the actuation SMA wires. 

Loading the simulated model with aerodynamic force F aero = 1500N, the charac- 
teristics in Fig. 10 are obtained for a 6 mm step desired skin deflection (5 V ). First of 
all, can be observed that the controller work good, the transition to the desired steady- 
state being significantly improved through the integration of the two control law in 
the equation (20): 1) the amplitudes of oscillations were reduced and the observed 
oscillations in the SMA temperatures around the steady-state are due only to the 
thermal inertia of the smart material; 2) the values of the transition time from 0mm to 
the steady-state values decrease from 20-^25 to approximate 5 s. 
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Fig. 9. "Integrated controller" block in Simulink. 
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Fig. 10. Response for a step input and F aero =l500 N. 
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4 Experimental Validation of the SMA Actuators Control 
for Open Loop Architecture of the Morphing System 

Starting from the theoretical and numerical simulation resulted considerations, to 
implement the controller on the physical model two Programmable Switching Power 
Supplies AMREL SPS 100-33, controlled by Matlab through a Quanser Q8 data ac- 
quisition card, were used (Fig. 11) [18]-r[21]. 

The power supplies have RS-232 and GPIB IEEE-488 as standard features and the 
technical characteristics: Power 3.3kW, Voltage (dc) 0-^100 V, Current (dc) 0-^33 A. 
The Q8 data acquisition card has 8 single-ended analog inputs with 14-bit resolution. 
All 8 channels can be sampled simultaneously at 100 kHz, with A/D conversion times 
of 2.4 us/channel, simultaneous sampling and sampling frequencies of up to 350 kHz 
for 2 channels. Also, the Q8 card is equipped with 8 analog outputs, with software 
programmable voltage ranges and simultaneous update capability with an 8 us set- 
tling time over full scale (20V) [21]. 
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Fig. 11. Physical model operating schema. 



The acquisition board was connected to a PC and programmed through Mat- 
lab/Simulink R2006b and WinCon 5.2 (Fig. 12). 

The input signals were two signals from Linear Variable Differential Transformer 
potentiometers that indicate the positions of the SMA actuators, and six signals from 
thermocouples installed on all the SMA wires components. The acquisition sampling 
time was set to 0.01 second. The outputs channels of the acquisition board were used 
to control each power supply through analog/external control by use of a DB-15 I/O 
connector. The current supplied to the actuator was set to be limited at 10 A, and the 
control signal was set to be 0-^0.606 IV (maximum voltage for the power supply is 2 
V for 33 A current supply). 

The operation principle of the physically implemented controller is relative simple. 
The initial input, which is the optimized airfoil for any flow condition, is chosen 
manually by the operator from the computer database through a user interface. Then 
the displacements (d7i, dY 2 ) that are required to be reproduced by the two control 
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Fig. 12. Simulink actuators control. 

points on the flexible skin are sent to the controller. This controller sends an analog 
signal - 2 V to the power supply that provides a current to the SMA. The SMA will 
respond accordingly and change its length according to the temperature of the wire. 
This will result in a change of the actuators positions, which are sensed by the linear 
variable differential transducer (LVDT). The signal position received from the LVDT 
is compared to the desired position and the error obtained is fed back to the controller. 
If the realized position is greater than the desired position the controller will discon- 
nect the control current letting the SMA wire to cool down. During the cooling down 
process the SMA will maintain its length due to the hysteretic behavior. This effect is 
taken into account for actuators displacement. Also the controller uses three thermo- 
couples signals from each SMA wire to monitor the temperature of the wires and 
maintain it below 130°C, as an upper limit. 

4.1 SMA Actuators Control Bench Test Validation 

The morphing wing system in the bench test runs is shown in Fig. 13. The gas springs 
that maintain the SMA wires in tension had a preloaded value of 225 lbs (1000 N) 
since in the laboratory condition there is no aerodynamic force. 




Fig. 13. Morphing wing system in the bench test runs. 
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After an initial calibration test the calibration gains and constants were established 
for the two LVDT potentiometers and for the six thermocouples. The calibration test 
for LVDT potentiometers consisted of several scans of airfoil using a laser beam. On 
the calibration, the SMA actuators were in "zero setting position" with no power 
supplied and the skin coordinates were measured using the laser beam that scanned 
the center line of the wing model. The laser was set to scan the chord of the model on 
a 370 mm length with a speed of 5 mm per second [21]. 

In the bench test, the 35 optimized airfoil cases were converted into SMA actuator 
#1 and #2 vertical displacements which were used as inputs reference for the control- 
ler. A typical test run history is shown in Fig. 14 for a=l°, Mach=0.3 flight condition 
(d7i=5.22 mm, d7 2 =7.54 mm - vertical displacements of the skin in the actuation 
points). 
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Fig. 14. Bench test for a=l°, M=03 flight condition. 



On observe that the controllers, in the two actuation lines, work even in zero values 
of the desired signal because of the gas springs pretension. Also, small oscillations of 
the obtained deflection are observed around the desired values of the deflections. The 
amplitude of the oscillations in this phase is due to the LVDT potentiometers me- 
chanical link and to the inertia of the SMA wires, being smallest than 0.05 mm. The 
heating phase is approximately 9 times more rapidly than the cooling phase; heating 
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time equals 8 s while the cooling time equals 70 s. There can be observed the differ- 
ences between the numerical model of the SMA actuators and the physical model. 

The bench test results confirmed that the experimental version of the designed in- 
tegrated controller woks well even in the lab conditions, where no aerodynamic forces 
are loaded and the preloaded gas springs force is 1000N. 

4.2 SMA Actuators Control Validation in Wind Tunnel Tests 

Once confirmed the well working of the designed integrated controller through bench 
test, the next step in our morphing wing project was to validate the controller in a 
wind tunnel test simultaneously with the transition point real time detection and visu- 
alization for all 35 optimized airfoils. The model was tested for all 35 theoretical 
studied flight conditions, a comparative study being realized around of the transition 
point position for the reference airfoil and for each optimized airfoil. So, simultane- 
ously with the controller testing, a validation study for the aerodynamic part of the 
project was realized [21]. 

The morphing wing system in the wind tunnel runs is shown in Fig. 15. 




Fig. 15. Wind tunnel morphing wing model. 



The transition detection was made real time using the pressure data obtained from 
the 32 Kulite and optical pressure sensors. The pressure data acquisition was per- 
formed using the IAR-NRC analog data acquisition system which was connected to 
the 32 sensors. The sampling rate of each channel was 15 kS/s, which allowed a 
boundary layer pressure fluctuations FFT spectral decomposition up to 7.5 kHz for all 
channels. The signal was processed by use of Simulink and visualized in real time on 
the computer screen in dedicated windows. 

The pressure signals were analyzed through Fast Fourier Transforms (FFT) de- 
composition in order to detect the magnitude of the noise in the surface air flow. Sub- 
sequently, the data is filtered by means of high-pass filters and processed by calculat- 
ing the Root Mean Square (RMS) of the signal in order to obtain a plot diagram of the 
noise in the air flow. This signal processing is necessary to disparate the inherent 
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electronically induced noise, by the Tollmien-Schlichting that are responsible for 
triggering transition from laminar flow to turbulent flow. The measurements showed 
that in processed data the transition appeared at frequencies between 3kHz - 5kHz and 
the magnitude of pressure variations in the laminar flow boundary layer are of the 
order 5e-4 Pa (7.25e-8 psi). The transition between laminar flow and turbulent flow 
was shown by an increase of the pressure variations, reflected also by a strong varia- 
tion of the pressure signal RMS. 

For the wind tunnel test the preloaded forces of the gas springs were reconsidered 
to the 1500 N because of the presence of the aerodynamic forces on the flexible skin 
of the wing. In Fig. 16 are presented the control results for test run a=2°, Mach=0.225 
(dy 1= 5.56 mm, dF 2 =7.91 mm). 
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Fig. 16. Wind tunnel test for a=2°, M=0.225 flight condition. 



The experimental results show a decrease of the SMA wires work temperatures 
vis-a-vis of numerically simulated and bench tested cases. An explanation can be the 
appearing of the aerodynamic forces with particular values for each flight condition. 
The decrease of these temperatures is a beneficial one taking into account the negative 
impact of a strong thermal field on the other component of the system, especially on 
the flexible skin and on the pressure sensors. Also, from the experimental results a 
high frequency noise influencing the LVDT sensors and the thermocouples instru- 
mentation amplifiers can be observed. The noise sources are the wind tunnel 
vibrations and instrumentation electrical fields. With this noise, the amplitude of the 
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actuation error (difference between the realized deflections and desired deflections) is 
under 0.07 mm, but this don't affecting the transition, which is stable on a sensor with 
a high RMS spike; Fig. 17 presents the results obtained on the transition monitoring 
for the run test in Fig. 16. 

The actuation line control obtained precision can have some influence in the transi- 
tion point position detection only if the density of the chord disposed pressure sensors 
becomes bigger; from the experimental data evaluation one concluded that, even the 
value of the error is 1 mm around the optimized values, the transition point position 
on the airfoil surface is not significantly changed. 

In Fig. 17 are presented the outputs of the Kulite pressure sensors in leading edge - 
trailing edge sense of positioning (3 sensors are broken and was not considered in the 
monitoring phase) and the real time pressure signals RMS for each of these sensors. 
The left hand column presents the results for reference airfoil, and the right hand 
column the results for optimized airfoil. The spike of the RMS suggests that we have 
turbulence on the sensor no. 13, near the trailing edge. 
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Fig. 17. Results obtained on the transition monitoring for the run test in Fig. 16. 
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So, the results obtained for the actuators control are very good, the controller fully 
satisfying the requirements imposed for the project purpose achievement. 

5 Wind Tunnel Experimental Validation of the SMA Actuators 
Control for Closed Loop Architecture of the Morphing System 

The next step of the work on the project supposed the development of the closed loop 
control, based on the pressure information received from the sensors and on the tran- 
sition point position estimation. Evidently, the closed loop control included, as an 
internal loop, the actuation lines previous presented controller [22]-[24]. 

The pressure data acquisition was performed using a NI-DAQ USB 6210 (Fig. 18) 
card with 16 analog inputs (16 bits, and a total sampling rate of 250 kS/s), 4 digital 
inputs and 4 digital outputs. 
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Fig. 18. The closed lop architecture of the morphing wing. 



The basic idea of this developed closed loop is to by-pass the necessity of a 
previously calculated optimized airfoils database, and to generate the optimized 
airfoils in real time starting from the pressure information received from the sensors 
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and from the morphing wing main goal (the improvement of the laminar flow over the 
wing upper surface). In this way, a mixed optimization method was used, between 
„the gradient ascent" or „hill climbing" method and the „simulated annealing" 
method. Two variants were tested for the starting point on the optimization map 
control: 1) dYi=4 mm, dY 2 =4 mm (Fig. 19), and 2) dFi opt , d7 2 o P t of the theoretically 
obtained optimized airfoil [22] -[24]. 

In order to obtain the closed loop morphing wing system in this architecture, the 
software application was developed in Matlab/Simulink and two National Instruments 
Data Acquisition Cards were used: NI-DAQ USB 6210 and NI-DAQ USB 6229 
(Quanser Q8 data acquisition card was removed from this configuration). Also, to 
realize the control of the morphing wing in closed loop, only the signals from the 
Kulite pressure sensors were used. In the beginning of wind-tunnel tests, a number of 
sixteen Kulite sensors were installed, but due to their removal and re-installation 
during the next two wind tunnel tests, four of them were found defective, therefore a 
number of twelve sensors remained to be used during the last third wind tunnel tests. 
So, only twelve Kulite sensors were used for plotting the RMS [22]-[24]. 
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Fig. 19. Optimization logic scheme. 



The input channels of the NI-DAQ USB 6210 data acquisition card were 
connected directly to the IAR-NRC analog data acquisition system, which was 
connected to the 12 Kulite sensors. The IAR-NRC served as an amplifier and 
conditioner of the signal at a sampling rate of 15 kS/s. One extra channel was used for 
the wind-tunnel dynamic pressure q acquisition to calculate the pressure coefficients 
C p from the pressure values measured by the 12 pressure sensors. The signal was 
acquisitioned at a sampling rate of 10 kS/s in frames of 1024 points for each channel, 
which allowed a boundary-layer pressure fluctuation fast Fourier transform spectral 
decomposition up to 5 kHz for all channels, at a rate of 9.77 kS/s using 
Matlab/Simulink software. The plot results were visualized in real time on the 
computer screen [22] -[24]. 

The transition between laminar and turbulent flow was detected, as we already 
said, by means of each pressure signal's RMS. For example, for incidence angle of 0°, 
and Mach number 0.3 flow condition, the FFT of the pressure signals resulted on the 
form in Fig. 20. The results in the left side of the figure were obtained for the un- 
morphed wing, while the results in the right side were obtained for the morphed wing 
in closed loop on the base of the pressure information received from the sensors. For 
morphed wing, a detachment of the pressure signals on the sensor number 10 
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(corresponding to sensor number 13 in the previous configuration, presented at the 
point 4) was observed, with a peak around of 3 kHz frequency. 

To visualize in real time the performances of the morphed wing, a Graphical User 
Interface (GUI), in which all the aerodynamical and morphing shape information 
were centralized together with the control buttons of the controlling software, was 
buildt (Fig. 21). The window shows information about the Mach number, the angle of 
attack, the airfoil shape of the morphing wing, and the two actuators vertical 
displacements needed to obtain the desired airfoil shape. In the two sides of the Fig. 
20 the GUI shows the coefficients pressure distribution Q?'s of the twelve Kulite 
sensors, and the noise of the signal (RMS) of each pressure signal, for the wing un- 
morphed position (left side), and for the wing under its morphed position (right side), 
for a=0°, M=0.3. The lowest RMS plot given in Fig. 21 shows the quantity of the 
pressure signal noise from each Kulite sensor (red star curve). The RMS red plot in 
the un-morphed configuration (left side of the Fig. 21) does not show any transition 
due to the fact that all twelve sensors show the turbulent flow [22] -[24]. 

As we can see from the „Control SMA" block scheme (Fig. 22), the two SMA 
actuators had six wires each, which were supplied with power by the two AMREL 
SPS power supplies, controlled through analog signals by the NI-DAQ USB 6229 
data acquisition card. The NI-DAQ, wich have 32 analog inputs (16-bit, 250 kS/s), 4 
analog outputs (16-bit, 833 kS/s), 48 digital I/O (32 at up to 1MHz), and 32-bit count- 
ers, was connected to the control laptop through an USB connection. The control 
program implemented in Simulink, used as feedback three temperature signals 
coming from three thermocouples installed on each wire of the SMA actuator, and a 
position signal from a LVDT sensor connected to the oblique cam sliding rod of each 
actuator. The temperature signals served in the overheat protection system that 
disconnected the current supply to the SMA in case of wire temperature passed over 
the set limit of 120°C. The position signals served as feedback for the actuator desired 
position control. Also, the controller activates the power supplies with the needed 
SMA current values through an analog signal. The control signal of 2 V corresponded 
to a SMA supplied current of 33 A [22] -[24]. 




Fig. 20. Pressure signals FFT for un-morphed and morphed wing, for a=0°, M=0.3. 
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Fig. 21. GUI for un-morphed and morphed wing, for a=0°, M=03. 
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Fig. 22. Control SMA" block scheme. 



This new conceived Simulink scheme for closed loop allows users to choose be- 
tween the manual deflections setting dY u dY 2 (open loop), and automatic control of 
the morphing wing (closed loop). Also, the software allows users to make different 
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functioning tests on the morphing wing system components (for example, on the 
power supply). 

Our designed SMA controller was embedded in the "PI + On-Off ' blocks for first 
and second actuators, and was used, as an internal loop, for the closed loop control of 
the morphing wing system. The closed loop results presented in Fig. 23 for a=0°, 
M=03 flow case, confirmed once again that the designed SMA actuators controller 
worked very well. Also, the results validated the theoretical optimized airfoil obtained 
by Ecole Polytechnique in Montreal for this case, taking into account that optimiza- 
tion method implemented in the closed loop conducted to a morphed airfoil almost 
identical with the firs one, and the transition was detected on the same pressure sensor 
with the open loop case. 
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Fig. 23. Wind tunnel test for a=0°, M=03 flow condition. 



6 Conclusions 



The objective of here presented research work is to develop an actuation control con- 
cept for a new morphing mechanism using smart materials, like Shape Memory Alloy 
(SMA), as actuators. These smart actuators modify the upper surface of a wing made 
of a flexible skin so the laminar to turbulent transition point moves close to the wing 
airfoil trailing edge. 

The designed controller must controls the SMA actuators in terms of supply elec- 
trical current so that to cancel the deviation between the required values for vertical 
displacements (corresponding to the optimized airfoils) and the real values, obtained 
from position transducers. The envelope of the SMA actuator in Fig. 5, obtained 
through numerical simulation using model in Fig. 4 for different aerodynamic load 
cases, confirms that the length of the SMA wires is a complex function of the SMA 
load force and temperature, the last one being influenced by the supplying current in 
time and by the interaction of the wires with the environment in theirs cooling phase 
(when the electrical supply is removed). 

As can be observed from Fig. 5, to obtain a skin maximum vertical displacement 
(8 mm) in absence of aerodynamic force, it is required a high temperature (approxi- 
mately 162°C) in order to counteract the spring force. Because the ability of the SMA 
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wires to contract is dependent upon Joule heating to produce the transformation tem- 
perature required, the need in higher temperature is reflected by a need in higher elec- 
trical current. Due to the fact that the aerodynamic forces reduce the actuators load the 
required current and temperature values are decreased; i.e. for F aero = 1800 N the need 
in temperature for the maximum vertical displacement obtaining is approximately 
90°C. 

The final configuration of the integrated controller was a combination of a bi- 
positional controller (particularly an on-off one) and a PI (proportional-integral) con- 
troller, due to the two phases (heating and cooling) of the SMA wires interconnection. 
The resulted controller must behave like a switch between cooling phase and heating 
phase, situations where the output current is A, or is controlled by a law of PI type. 

Using an integral criterion, the error minimum surface criterion (Ziegler-Nichols), 
the PI controller for the heating phase was optimal tuned, the resulted values are 
K P = 1792.8 and £/=787.0061. Evaluating the systems' performances one observed 
that the poles of closed loop transfer function of the controlled heating phase resulted 
with the values (14) are all placed in the left-hand side of the s-plane, so the obtained 
system is stable. On the other way, the system was found to be completely controlla- 
ble and observable based on the values established in equations (17)-r(19). So, the 
final form of the integrated controller law was (20). 

Loading the numerically simulated general model (the non-linear one with F preten . 
5ion =1500N) in Fig. 8 with aerodynamic force F aero = 1500N, the obtained characteris- 
tics in Fig. 10 confirm that the controller works good, the transition to the desired 
steady-state being significantly improved through the integration of the two control 
law in the equation (20): 1) the amplitudes of oscillations were reduced and the ob- 
served oscillations in the SMA temperatures around the steady-state are due only to 
the thermal inertia of the smart material; 2) the values of the transition time from 
0mm to the steady-state values decrease from 20-^25 to approximate 5 s. 

As second and third validation methods a bench test and a wind tunnel test were 
performed. 

In the bench test phase, the 35 optimized airfoil cases were converted into SMA 
actuator #1 and #2 vertical displacements which were used as inputs reference for the 
controller. The characteristics in Fig. 14 (a=l°, Mach=0.3 flight condition) show that 
the controllers, in the two actuation lines, work even in zero values of the desired 
signal because of the gas springs pretension. Also, small oscillations of the obtained 
deflection are observed around the desired values of the deflections. The amplitude of 
the oscillations in this phase is due to the LVDT potentiometers mechanical link and 
to the inertia of the SMA wires, being smallest than 0.05 mm. The heating phase is 
approximately 9 times more rapidly than the cooling phase; heating time equals 8 s 
while the cooling time equals 70 s. 

For the final experimental validation test (wind tunnel test), with real aerodynamic 
forces load, the 1500 N preloaded forces of the gas springs was reconsidered. From 
Fig. 16 (a=2°, Mach=0.225) a decrease of the SMA wires work temperatures vis-a-vis 
of numerically simulated and bench tested cases is observed. The decrease of these 
temperatures is a beneficial one taking into account the negative impact of a strong 
thermal field on the other component of the system, especially on the flexible skin and 
on the pressure sensors. Also, a high frequency noise influencing the LVDT sensors 
and the thermocouples instrumentation amplifiers can be observed. The noise sources 
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are the wind tunnel vibrations and instrumentation electrical fields. With this noise, 
the amplitude of the actuation error (difference between the realized deflections and 
desired deflections) is under 0.07 mm, but this doesn't affecting the transition, which 
is stable on a sensor with a high RMS spike like in Fig. 17 and Fig. 21 (for closed 
loop). 

So, the results obtained for the actuators control are very good, the controller fully 
satisfying the requirements imposed for the project purpose achievement. 
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Abstract. This paper presents the development and implementation of energy- 
efficient parameter adaptation for a grey-box model representing the tempera- 
ture profile in spatial points of the interior of a refrigerated container with the 
aim to improving the logistics of perishable goods. A mixed linear / non-linear 
singe-input- single-output grey-box model was selected for accurate prediction 
of the temperature behavior of the loaded food products. The algorithms were 
specially modified to reduce the matrix dimensions, implemented in Matlab, 
and applied to experimental data for validation. Apart from being highly accu- 
rate, the predictions comply with the desired figures of merit for the implemen- 
tation in wireless sensor nodes, such as high robustness against quantization and 
environmental noise. The OSGi framework, which allows for easy update of 
software bundles, was selected as basis of the software implementation on the 
iMote2 as sensor network platform. Performance measurements have shown 
that this method provides a fast and accurate prediction with high energy 
efficiency. 

Keywords: System identification, Temperature, Organic heat, Feedback- 
hammerstein, OSGI, Java. 



1 Introduction 

Research has been done in the past to estimate the temperature profile inside refrige- 
rated containers. Several options have been investigated: mathematical approaches as 
presented in [1], K-s models as proposed in [2], and several numerical models as re- 
viewed in [3]. With the exception of [4], in which the effect of the pallets is consi- 
dered; usually the focus is put on the cold air flow as the main factor governing the 
temperature pattern inside a container and the effects due the cargo presence is sub 
estimated. 

Babazadeh [5] suggested an approach that takes the effect of the cargo to the tem- 
perature into account. He proposed the use of wireless sensor nodes (WSN) to meas- 
ure the ambient parameters in the surroundings of a spatial point of interest and 
the use of system identification to estimate the parameters of a linear Multi-Input 

J.A. Cetto et al. (Eds.): Informatics in Control, Automation and Robotics, LNEE 89, pp. 77— pp|. 
springerlink.com © Springer- Verlag Berlin Heidelberg 201 1 



78 



J. Palafox-Albarran, R. Jedermann, and W. Lang 



Single-Output (MISO) system and concluded that in order to have a good estimation, 
it is necessary to have a high number of training samples and many inputs to the 
system. 

In this paper an alternative Single-Input Single-Output (SISO) grey-box model is 
presented to predict the temperature inside the container under the presence of perish- 
able goods with the aim of reducing the complexity and preserving the accuracy. The 
proposed model provides a meaningful description of the factors involved in the phys- 
ical system including the effect of transporting living goods such as fruits and vegeta- 
bles. The starting point is based on the physical relations; subsequently, a tuning 
parameter for the specific case of bananas is found by simulations. 

2 Model of the System 



The factors affecting the temperature distribution inside a refrigerated container are il- 
lustrated in Figure 1 . The cold air flows from bottom to top through the gratings in the 
floor and through the spaces between the pallets, and eventually the air is drawn off 
the channel between the pallets and the container ceiling. 

A naive representation of the container can be done by a SISO linear dynamic sys- 
tem in which the input is the air supply and the output is the spatial point of interest. 
However, in reality this is only a simple model of the main contributor to the tempera- 
ture pattern, the air flow. Several other factors affect the speed of the cooling down. 

To improve the accuracy of the model, other contributors are considered as well: 
first is the heat, produced by respiration of living goods such as fruits and vegetables; 
second is the thermal loss, affecting the correct cooling of the goods; finally, unpre- 
dictable temperature variations due to highly changing external climatic conditions 
during transportation. 



Disturbances 



Isolation losses 




Organic heat 
Fig. 1. Factors affecting the temperature inside a refrigerated container. 



The linear SISO black-box model which represents the air flow is represented 
mathematically by a linear dynamic system H, in the discrete domain, given by the 
Equation 1. 
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Atq- 1 ) 



(1) 



Where n a and n b are the orders of the system polynomials, b ± ... b^, a x ... a na are the 
polynomial coefficients, and q is the delay operator in discrete domain. 

An attenuator, a ,models the isolation loses of the air supply temperature and is 
modeled to affect the input of the dynamic system. The external climatic conditions 
are unknown in advance, therefore considered a statistical process. The output of the 
Moving Average (MA) process, which is in fact white noise (WN) filtered by the fil- 
ter C represented in Equation 2 added to the output of the dynamic system, models 
them. 



C(q !) = 1 + Cl q x + ... + c n jf 



(2) 



To model the organic heat, it is necessary to use experimental data. Figure 2 [6] 
shows a family of curves for organic heat in the case of bananas. A proportional 
relationship between of the organic heat and the rippening state is observed. 



Heat 
W/Ton 

A 



300 



200" 



1000 - ■ 



Yellow 




Green 



Temperature C 
Fig. 2. Heat Production of bananas. 



Equation 3 represents the organic heat relation with respect to the temperature. Pf ru t t 
is the heat production in Watts, y is a constant which is fixed for a certain type of fruit 
and rippening- state in 1/°C, T is the fruit temperature in °C, and /? is a scaling factor 
which depends of the amount of food and is given in kilograms. 



"fruit 



= Pe 



yt 



(3) 



Finally, the block diagram to represent the input-output relations of all the factors is 
built. It is shown in Figure 3. The air flow dynamics are represented as a feed-forward 
block as it is the most important contributor. The isolation losses affect the correct 
cooling of the goods before the dynamic system and the noise effect has an additive 
effect on the output. 

The contribution of the organic heat depends on the cooling temperature inside the 
container. Simultaneously, it has a small additive effect in the input of the linear 
dynamic system as the air flows through the pallets and is slightly warmed. It is 
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represented by a static exponential feedback. The resulting block diagram, in which a 
linear dynamic system has a non-linear feedback corresponds to a Feedback- 
Hammerstein (FH) configuration [7]. 



u ► 



a -0 




Fig. 3. Model of the system. 



3 Parameter Adaptation Algorithm 



In [7] a Parameter Adaptation Algorithm (PA A) was developed to identify the 
parameter-set of a FH system. It uses an intermediate variable y(t) and converts the 
non-linear system into a pseudo-linear one. Its principal advantage is that the 
conventional recursive matrix-based linear system identification algorithms as those 
presented in [9] can be applied to estimate the parameter matrix □. The recursive 
form of those algorithm is given by Equation 4. Where e(t) is the prediction error as 
described in Equation 5, P(t+1) is an adaptation matrix to perform the minimization 
of 8 using Recursive Least Squares method, and cp(t) is the observation matrix that 
contains the input and the output data. A(t + 1) in Equation 6 is the so called 
Forgetting Factor (FF). 



0(t+l)=0(t)+(P(t+ l)cp(t)) 7 £(t) 



(4) 



£ (t) = y(t) - 0(t) r cp(t-l) 



(5) 



p(t)-p(t)(p(p T ( T p(t \ — -) 
^ ' k(t+i) 



(6) 



X{t + 1) = X * A,(t) + 1-X 



(7) 



Guo [7]considers the non-linearity as a polynomial of order / as shown in Equation 8; 
however, the dimensions of the matrices in the algorithm would be significantly too 
large to be applied in platforms where power consumption is an important figure of 
merit. 



i?(y(0) = ZUoM*y k (t) 



(8) 



In order to reduce the dimensions of the matrices, it was proposed to use the exponen- 
tial term in Equation 3 instead, y is to be determined and it remains constant, while p 
is a parameter to be identified as it depends on the amount of fruit being transported. 
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The linear term of the Equation 8 needs to be extracted to be included in the 
polynomial i4*(q -1 ) of the equivalent SISO pseudo-linear system. Expanding it into a 
Taylor series and rearranging, the summation of the non-linear coefficients of the 
exponential function can be calculated using Equation 9. The non-linear coefficients 
and an offset are on the left hand of the equation. 



z -_ 2 02g»! +1 = e ryM _„,(,) 



(9) 



The equivalent pseudo-linear system for an exponential non-linearity is shown in 
Equation 10. 

^(^- 1 )y(t)= hau(t) +M* yy(t) - b^yit) + B -^f±m + CCerXO (10) 

The resulting coeficients of the polynomials i4*(c/ -1 ) and £*(q _1 ) are given by 
Equation 11 and 12. 



al = a k~ (Py)h 



(11) 



-n b 



B*{q- 1 ) = b 2 q- 2 + ... + b nb q 
And the intermediate variable is shown by Equation 13. 

Kt) = fc 1 [au(t)+jff(e^ t )-yy(t))] 



(12) 



(13) 



The choice of the forgetting factor in the algorithm is often critical. In theory, it must 
be one that converges. On the other hand, if it is less than one the algorithm becomes 
more sensitive and the estimated parameter changes quickly making the convergence 
faster. A more complex solution is to allow it to vary with time, lower than one at the 
beginning but tending to one. 
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0.995 



0.99 



Types of Forgetting Factors 
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Variable Forgetting Factor 



100 200 300 400 500 600 700 800 900 
Iteration 



Fig. 4. Types of forgetting factors. 
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Figure 4 illustrates three different types of FF. The first case is obtained by mak- 
ing Z , and A(t) in Equation 7 equal to one. It is called Decreasing Gain (DG). In the 
second case, the Constant Forgetting Factor (CFF) A(t) is set to a value smaller than 
one and A set to one. Finally, the Variable Forgetting Factor (VFF) uses a value of 
X smaller than one and recalculates A(t) for each iteration. 



4 Prediction Algorithm 

The predictions are made using the estimated parameters in the model. Figure 5 
shows experimental data sets from a container transporting bananas. It can be 
observed how the air supply is kept constant after some days. For the prediction 
algorithm, u(t) is set to the value of the last sampled input temperature of the 
parameter adaptation process. Similarly, the initial predicted output value is set to the 
last acquired value of the output. Equation 14 to 17 describes the prediction 
algorithm, m is the number of iterations used for the PAA. 



y P red<jn) =y(rn) 

y P red(t)= @ 7 (m) (p pred (t-1) 

y P red(t) = b^auim) + p(eV y vre*V -yy pred (t))] 



(14) 
(15) 
(16) 
(17) 



Table 1. Elements of the elements in the algorithm matrices. 



Symbol 


Arrangement of the elements into the matrices 


<p(t) 


[~y(t) • • • - y (t - n a + 1), u(f - 1), (en® 

- ryit)),y{t - l)...y(t-n b ),e n (t)---e n (t - n c + 1) ) ] 


® r (t) 
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Ved (t) 


[-ypred(t) ••• - y V red(t ~ ™a + l),u(m), (e yy P™^ 

~ yypred W)i y P red (f ~ !)■ ■ ■ fpred (fi-n b J) ] 


© T (m) 




al...a* na , bl a,(]bJy bi .. K »/ bi 





5 Determination of y 

Considering the linear dynamic system H as the most important contributor to the 
temperature profile, an exponential discrete time decaying system like the one pre- 
sented in Figure 5 can be described as of the order of one with its unique pole on the 
real positive axis. The closer the pole to one the higher the delay of the system. 

The selection of y is a key factor in the precision of the algorithms. To find a 
trustworthy y parameter that characterizes the respiration heat of bananas. The 
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presented Feedback-Hammerstein model of linear order one and the FH parameter 
adaptation and prediction algorithms are run using given experimental data sets. The 
Mean Squared Error (MSE) of the prediction over n samples, equivalent to fifteen days, 
is stored for several values of y and fixed number of training days. If the stored values of 
the MSE are plotted, the local minimums are determined by the observation of the MSE 
vs. y curves. In Figure 6, it can be seen that in the above mentioned plot for five days of 
training and for the data set 1, a local minimum exists for a value y of 0.0587. 




5 10 

time [days] 



Fig. 5. Banana data sets. 



n 
MSE = -^(Vrealit) ~ Vpredit)) 2 



(18) 



MSE of FH ARX PAA with respect to gamma 




0.05 0.1 

Values of y 
Fig. 6. Prediction accuracy vs. y. 
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6 Results 



Several figures of merit are considered for validation of the model and algorithms. 
The accuracy and the speed of convergence are of paramount importance; however, 
quantization and noise robustness are also highly desirable for implementation in a 
WSN. Only the linear orders of one and two are considered to avoid computation of 
complex conjugate poles that would characterize oscillations. 

To observe the speed of convergence and the accuracy of the predictions with re- 
spect to the number of training days, parameter estimation and a prediction in Matrix 
form are done (See Table 1) for a fixed number of training days. Subsequently, MSE 
vs. Training days graphs are plotted. Assuming a quantization level of 0.2°C, a Mat- 
lab script was written to assign the nearest value of the quantization grid to the input 
and the output datasets. The results of the predictions using the quantized datasets are 
overlapped with the results of non-quantized. 

Similarly, to determine the noise robustness, MSE versus the signal to noise ratio 
(SNR) is plotted. Several noise levels of white noise were added to the output of the 
data set 1, and the resulting signals were applied to PAA and prediction algorithms 
with fixed number of training days. 



SNR(dB) =1 01og(^) 



(19) 



Table 2. Summary of simulation results. 





Accuracy 


Number of 

matrix 

elements 


Convergence 
speed 


Quantization 
Robustness 


Critical 
SNR 


Estimation 

for linear 

data 


Best 

Forgetting 

Factor 


Best 
Linear 
order 


ARX 


CFF 


2 


3 


Bad 


Good 


47dB 


Good 


ARMAX 


CFF 


2 


3 + n c 


Bad 


Bad 


43 dB 


FHand 
WN model 


DG 


1 


3 


Good 


Good 


43 dB 


Bad 


FHand 
MA model 


DG 


1 


3 + n c 


Good 


Bad 


43 dB 



Simulations were done for two types of data sets. First, the experimental data of 
bananas were used to include the presence of organic heat. Secondly, the data sets 
corresponding to a cheese experiment, which does not present organic heat, were con- 
sidered. A summary of all simulation results is presented on Table 2. 



6.1 FH vs. Linear Models in the Presence of Organic Heat 

From the simulations it is observed in Figures 7(a) and 7(b) that if linear methods are 
applied to the banana datasets, the accuracy of the results for different sensor 
positions of are not sufficient. Quantization robustness is improved with the linear 
order of one and the speed of convergence is better using CFF. Even in the best of 
cases acceptable prediction accuracy can only be achieved after more than five days 
of training. 
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Fig. 7. ARX of order one in the presence of organic heat. 

It is also observed in Figures 7(b) and 7(c) that FH identification algorithms are the 
best to achieve fast convergence speeds. In the best cases, less than 3 days of training 
is sufficient to achieve good predictions. However, the plots are made for the data 
from three days onwards to avoid the visualization of the effects in MSE due to the 
set point variations in the reefer supply temperature. Linear system orders of one are 
in all cases better than order of two, both in the speed of convergence and the quanti- 
zation robustness. Decreasing Gain must be optimal to preserve the accuracy and the 
quantization robustness. 

Concerning the noise models, results of the simulation of Feedback-Hammerstein 
with MA process are worse than when modeled as white noise (WN). It affects the 
accuracy and the quantization robustness. 

6.2 FH vs. Linear Models in the Absence of Organic Heat 

In the case of cheese data set, the linear methods accuracy results are better than that 
of the Feedback-Hammerstein as can be observed in Figure 8. Modeling noise as 
white gives better quantization robustness than modeling it as MA process. 



Linear ARX of order of one 
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- Quantized 
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Trainining Days 
FH of linear order of one 
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(a) 



Linear ARX of order of two 



- Non-quantized 

- Quantized 
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Linear ARMAX of order of two 



- Non-quantized 

- Quantized 



4.5 5 5.5 6 6.5 7 7.5 

Time (Days) 



(b) 



Fig. 8. Comparison of FH and linear methods in the absence of organic heat. 
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The use of forgetting factors does not have a big impact in the results of ARX 
predictions; however, Constant Forgetting Factor is slightly better for ARM AX 
predictions. Linear orders do not affect the simulated predictions, but an order of two 
is selected because it can model more accurately if the behavior of the system is not 
purely decaying. 

6.3 Noise Robustness 

The noise was added to validate FH and linear models; also for both of them the 
accuracy is compared with and without the MA model. Maximum Signal-to-Noise 
Ratio to obtain a good prediction is observed to be around 43 dB for all of them with 
the exception of ARX which has a maximum value of 47 Decibels as shown in 
Table 2. 

FH ARX Prediction Error vs SNR 




- Non-quantized 

- Quantized 



Signal to Noise Ratio(dB) 



Fig. 9. Noise Robustness for FH method. 



7 Prediction Improvement 



The described approach was originally developed based on an experiment in 2008 
with records for 3 sensors (data set A). Two new data sets with 16 sensors each, 
which were recorded in 2009 [9] in two separate containers (data set B and C), were 
used to cross validate the approach. 

FH algorithm of linear order of one was applied to all data sets; neither quantiza- 
tion nor forgetting factor is used. For the initial parameter settings, the pole and zero 
of the feed-forward linear system was set to 0.9 and 0.0; p was set to 2. 

The previously obtained value of y equal to 0.0587 is used to predict the tempera- 
ture inside the containers for many spatial positions. The results are compared to the 
predictions for the datasets shown in Figure 5 and resumed in Table 3. A good aver- 
age is observed for the three containers; however, in some positions the predictions 
are not as accurate as is observed in the Maximum column. 

A second approach is to select y according to the position of the pallets inside the 
container. The method to find y, described previously, is applied to all the new con- 
tainer datasets. 
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It is observed that an improvement in the accuracy of the predictions can be made 
if two different values of y are selected: one for pallets close to the door-end, and one 
for pallets close to the reefer supply. In Table 3(b) it is resumed the prediction results 
if values of 0.0525 and 0.055 are set respectively. 

Table 3. MSE prediction results. 





MSE prediction results for a unique 
value of y 


MSE prediction results for values 
of y according to the position in- 
side the container 


Container/Result 


Maxi- 
mum 


Mini- 
mum 


Average 


Maxi- 
mum 


Mini- 
mum 


Average 


Data set A 


0.1893 


0.0173 


0.0778 


0.1893 


0.0173 


0.0778 


Data set B 


1.4558 


0.0550 


0.4130 


0.4767 


0.0279 


0.0946 


Data set C 


0.8888 


0.0101 


0.2798 


0.5747 


0.0201 


0.1743 



8 Software Bundle Implementation and Energy Consumption 
Measurements 



In a container scenario, energy consumption turns out to be the most limiting factor, 
and therefore, a priority consideration. Furthermore, the system should be able to in- 
stall the tuned algorithm according to spatial positions and/or update it according to 
the new knowledge obtained from experimental results. 

The chosen hardware platform is Imote2 [11]. At the core of it is a PXA271 Intel 
processor, integrated with volatile and non-volatile memory, a power management IC 
to go to deep-sleep mode, a transceiver, and an antenna. Furthermore, it allows stack 
ability of additional modules to interconnect additional devices, such as, temperature 
sensor cards. 

Linux operating system is installed, and on it the Java based OSGi [12] framework 
(formerly Open Source Gateway Initiative) to enable features such as dynamic soft- 
ware updates. OSGi can update and install the so-called software bundles during run- 
time without interrupting the execution of the remainder of the system. 

For the evaluation of power consumption of the iMote2 the supply current was 
measured over a 1 Ohm resistor in series with the power supply wire, with a test 
probe connected in parallel to it. Additional hardware modules were detached one by 
one to measure their individual power consumption. Imote is programmed to run infi- 
nitely with the Feedback-Hammerstein training algorithm and the radio is powered at 
the end of each iteration to observe the last period of training. Regarding the predic- 
tion algorithm, it is considered that the iMote2 would only perform the training 
process locally and send the obtained parameters to a remote server. The server calcu- 
lates the model prediction. But because the server has only very relaxed energy- 
constraints compared to the iMote2, an energy analysis is not required. 
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The amount of Work consumed by the algorithms is calculated by equation 20. 

W = P*t (20) 

The required CPU time for the algorithm to perform one parameter adaptation iterati- 
on was measured. Measurements showed that each iteration takes aproximately 5 ms 
and consumes 0.9 mJ [13]; according to simulations, it is possible to predict the tem- 
perature after 3 days with a sampling rate of one hour that equals to 72 iterations the 
equivalent of only 64.8 mJ. 



Start 




Turn radio off 



Prediction 



Turn radio on 

i 



Fig. 10. Program flow diagram to measure FH training time. 



9 Conclusions 



A model to represent the factors affecting the temperature inside a refrigerated 
container transporting perishable goods was proposed. It models the effect of organic 
heat using a static non-linear feedback system, the refrigeration by a linear dynamic 
feed- forward system, and the disturbances by stochastic processes. This complex 
model can provide an accurate description of the factors involved in the physical 
system. 

The selected identification method was adapted specifically to reduce the 
dimensions of the matrices. The non-linear exponential function is used instead of a 
polynomial to preserve the simplicity of the parameter adaptation and the prediction 
algorithms. The disadvantage of the simplification is that depending on the kind of 
fruits to be transported, it is required to tune the algorithm by a correct selection of y 
which has to be known in advance. An improvement can be observed in the accuracy 
of the predictions if y is set according to the position of the pallets inside the container. 

Results concludes that the FH identification algorithm is efficient when the cargo 
emits organic heat. The method of FH of order 1 is optimal to achieve all figures of 
merit. It makes accurate predictions only after three days of training and maintains 
low dimensions of matrices. 

However, if the linear method is applied to the banana datasets, a comparable 
accuracy can only be achieved after more than five days of training. Also, results 
graphs evidence that when the goods to transport are free of organic heat, such as in 
the case of cheese, it is preferable to use a linear system instead. 
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Three days of Feedback-Hammerstein training, which is the minimum to achieve a 
good prediction, requires in total 64.8 mJ of energy on an iMote2 platform using Li- 
nux as Operating System and OSGi as software framework. The latter one allows dy- 
namic software updates and tuning of the algorithm according to the spatial position 
of a mote in a container or the installation of a linear parameter adaptation algorithm 
if the cargo does not produce organic heat. 
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Abstract. The performance of a network protocol depends highly on the config- 
uration of its parameter setting. Typical installations consider just a static setup 
that works well on average. The Organic Network Control (ONC) system pro- 
vides a possibility to adapt the configurations of network protocols to environ- 
mental changes dynamically at runtime. Such rapidly changing conditions can 
be found e.g. in mobile ad-hoc networks (MANets), since nodes that form the 
direct neighbourhood are moving and the sending distance is restricted. Thus, 
MANets are a promising application domain for the ONC system. This chapter 
describes how ONC is applied to the control of MANet protocols and demon- 
strates the positive effect caused by the on-line adaptation. Furthermore, a collab- 
oration mechanism is introduced that has been developed to alleviate the impact 
of some high-effort tasks of the system. Instead of fulfilling tasks on their own, 
neighboured nodes can share their knowledge and take over some responsibilities 
from each other. 



1 Introduction 

Data communication is characterised by a steadily increasing number of interconnected 
devices and the corresponding transfer load in networks - some of these already reach 
their limits. Hence, the currently used techniques (e.g. protocols and infrastructure) will 
not be able to cope with the demand in the near future |6|. Due to these upcoming de- 
mands, researchers started to develop new concepts (e.g. for the Internet [19]). Until 
they will be commonly used, the current problems have to be alleviated and progress 
towards new paradigms of networking is needed. One approach that optimises the per- 
formance of today's technologies is the Organic Network Control (ONC) system l24l . 
which adapts the - usually static - protocol configurations dynamically to environmen- 
tal changes. 

A demanding challenge for the ONC system is the control and adaptation of mobile 
ad-hoc network (MANet) protocols as they are used in highly dynamic environments. 
The possible movement of nodes leads to a continuous change of the situation: Neigh- 
bours are getting out of reach or joining the sending distance. This does not only lead to 
complex problems of how to configure the protocol, it also offers high potential for an 
improvement of the system performance. Within this chapter, we explain how the ONC 
system is applied to a MANet broadcast protocol and how the overall performance of 
the MANet system can be increased using ONC. 
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ONC combines the advantages of on-line learning with safety aspects by providing 
a sandbox-learning component to extend the behavioural repertoire of the system in 
cases of unforeseen situations. This safety concept and the restrictions applied to the 
learning component are characterised by drawbacks in terms of resource usage and 
the duration until a new rule is available. To alleviate this effect, a new collaboration 
mechanism has been developed that allows for knowledge exchange and dynamic load 
balancing between neighbouring nodes. After discussing the basic application of ONC 
to the control of MANets, this collaboration mechanism is explained followed by an 
analysis of its behaviour. 

This chapter is based on the initial concept as presented in l23l which demonstrates 
the application of ONC to MANet-based broadcast protocols. Compared to the pre- 
vious paper, the approach is extended by introducing a collaboration mechanism that 
reduces the ONC-implied overhead significantly. The chapter is organised as follows: 
Section 2 describes the related work and gives an overview of approaches that aim at 
adapting network protocols dynamically to changing environments. Afterwards, Sec. 3 
briefly presents the ONC system and its architectural design. Furthermore, the adapta- 
tions of the framework to allow for the control of MANet protocols are described and a 
short evaluation is used to outline the positive effect of ONC control in MANets. As ex- 
tension to the previous paper, Section 4 introduces a new collaboration mechanism that 
increases the learning performance of ONC. Finally, Sec. 5 summarises the presented 
system and names further research to be done. 

2 Related Work 

During the last decade, researchers developed several approaches to automatically adapt 
network protocols to some environmental conditions. In this context, IBM's Autonomic 
Computing (AC) initiative [ 8 ] has been the most prominent platform. New demands in 
networking and reaching the limits of existing networks brought up challenges that can 
be covered by adaptation and more dynamic solutions. Besides the ONC system as 
discussed in this chapter, different directions of research are known to cope with the 
problem: adaptive protocols, composition of protocol stacks, centralised solutions to 
adapt protocol configurations, or self-parametrising systems. 

2.1 Protocol Composition 

The most popular way of enabling adaptivity aspects for data communication proto- 
cols is to integrate them into the protocol logic. Prominent examples are the conges- 
tion avoidance in TCP 1 12] and the collision detection on the link layer (e.g. Ethernet- 
protocols 1 9]). Considering adaptivity aspects within the protocol logic has drawbacks: 
The logic has to be changed and a co-operation with standard non-adaptable protocols 
might not be possible (besides further aspects like limited re-usability, limitation to only 
one purpose, etc.). Consequently, a research domain called protocol stack composition 
emerged that provides a more generalised solution. Instead of interfering with the proto- 
col logic, it covers the upcoming tasks by exchanging protocols and stacks dynamically 
1 17]. Several different systems have been developed through the years - the most pop- 
ular representatives might be Appia lfT3lL Cactus Q, and Horus fl5l . In contrast to the 
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ONC system, all the named approaches do not adapt the protocols and their configu- 
rations, instead they select protocols with static settings on demand from a pre-defined 
set. Thus, the behavioural repertoire is limited to this fixed set of actions without any 
chance to extend it at runtime (e.g. in cases of unforeseen situations or disturbances). 
Furthermore, the usage of a centralised element which is responsible for observing the 
whole network and deciding about a global re-composition for the protocol stack is 
characterised by typical problems of centralised systems - for instance, an adaptation 
on a per-node basis is not possible. 



2.2 Centralised Protocol Adaptation 

Due to the constraints of protocol composition in terms of restricted rule-bases, systems 
have been developed that focus on adapting protocols by considering their variable pa- 
rameters at runtime - namely the work presented by Sudame and Badrinath 1211 , Ye et 
al. (26), and Georganopoulos and Lewis 0. The authors of the former approach dis- 
cussed a first TCP- and UDP-based study and defined the need of dynamic adaptation, 
but detailed examination and a demonstration of the re-usability for other protocols are 
currently not addressed. Ye et al. proposed a centralised system that tackles the opti- 
misation task for each node using a heuristic approach in combination with a simula- 
tion of the complete network at server-side. Georganopoulos and Lewis discussed their 
dynamic optimisation framework for the reconfiguration of network protocols. Based 
again on a central element, they mainly focus on cross-layer optimisation for the proto- 
col stack, but less on considering environmental conditions. Since all three approaches 
rely on a client-server concept, problems like e.g. bandwidth usage, single point of 
failure, or accessing local knowledge have to be covered in order to allow for such a 
division of work between a central server and the particular network nodes. 



2.3 Self -parametrising Systems 

Recently, part of the research split from the centralised approaches and presented self- 
organised systems such as ONC, since the drawbacks of centralised systems cannot be 
covered in settings like MANets. As one example, Su et al. |22] presented their ap- 
proach for a mobility-adaptive self-parametrisation of different unicast and multicast 
routing protocols. They calculate link expiration times based on locally measured in- 
formation (position, velocity, and moving direction) and periodic status messages of 
the neighbours with the additional need of a positioning system like GPS. In con- 
trast, Ahn et al. H) do not depend on the existence of a positioning system. Instead 
they use the observed number of changes in the 1-hop neighbourhood of the nodes as 
mobility metric. Similar concepts have been presented by, e.g., Stanze et al. |20] or 
Boleng 1 3 ]. All approaches named before are restricted to routing protocols in MANets. 
They are application- specific and cannot be transferred to enable self-parametrisation 
of protocols from other domains, since they focus on determining the mobility aspect 
in MANets and use pre-defined actions for observed situations. 
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Fig. 1. ONC's architecture. 



3 On-Line Adaptation of MANet Protocol Configurations 



3.1 Organic Network Control 

The Organic Network Control (ONC) system adapts network protocol configurations 
dynamically at runtime [24|. The system's architecture is founded on the generic Ob- 
server/Controller approach as presented by Richter et ah in [ 16] and is organised using 
three consecutive layers, see Fig. Q] The following part of this chapter describes the 
particular layers and introduces tasks that have to be fulfilled when applying ONC to 
a new protocol. The lowest layer (Layer 0) is used to encapsulate an existing network 
protocol instance, e.g. a broadcast algorithm for mobile ad-hoc networks (MANets) or 
a Peer-to-Peer protocol. In terms of Organic Computing [18], this controlled network 
protocol instance is the System under Observation and Control (SuOC). ONC provides 
a basic solution for controlling existing protocols without the need of knowing internals 
of the protocol or interfering with its logic. However, it is required that the parameters 
of the protocol can be altered on-line. Additionally, the current status of the protocol 
instance and its environment have to be observable and accessible. 

Layer 1 of the architecture adapts the SuOC dynamically to changes in the envi- 
ronment. It therefore consists of two basic components: an Observer and a Controller. 
The Observer is responsible for collecting status information about the network proto- 
col instance, its settings, and its environmental conditions. It aggregates the observed 
figures and augments them with optional further knowledge (prediction values, historic 
knowledge) and builds a vector describing the current situation at the node. This situa- 
tion vector serves as input to the Controller which has to fulfil two tasks: evaluate the 
system's performance within the last evaluation cycle and decide about the next action 
to be taken. Since ONC shall be able to self-optimise the quality of its adaptation pro- 
cess, the main component of the Controller is a learning technique, which is realised 
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as an adapted variant of Wilson's Learning Classifier System XCS [25]. This learning 
component is responsible for choosing the best currently known configuration for the 
observed situation. The automated learning is achieved by analysing the last actions 
and updating the measurements for the underlying rules. In cases where no appropri- 
ate rules are available, new classifiers need to be created. In contrast to the original 
XCS algorithm, the ONC-variant is not allowed to create new classifiers (pairs of sit- 
uation/conditions and parameters/actions) randomly by a Genetic Algorithm. Instead, 
control is transferred to Layer 2 of the architecture. 

Layer 2 is again designed using the Observer/Controller pattern: The Observer mon- 
itors Layer 1 and realises the need of a new classifier for a specific situation. The Con- 
troller part contains two basic components: a simulator and an Evolutionary Algorithm 
(EA) Q. The Controller creates an appropriate simulation scenario from the situation 
vector and triggers the EA to repeatedly evolve a number of parameter sets for the 
network protocol. These parameter sets are evaluated in the simulator. This bears the 
advantage that newly created parameter sets are not directly used in the live system, as 
this can cause the system to perform badly or even malfunction. Only those parameter 
sets that qualify in the simulator of Layer 2 are passed back to Layer 1 and may then be 
applied in the real world - thus, Layer 2 allows for a kind of sandbox-learning. 

The ONC approach as described before provides a black-box solution to control dif- 
ferent types of network protocols. The integration process of a new protocol into the 
ONC framework requires that an engineer has to fulfil three major tasks: Specify the 
performance metric, describe the situation (what are the dynamic factors defining the 
need of an adaptation, e.g. available neighbours and their positions in MANets) accom- 
panied by a distance function between two situations, and provide a simulation model 
to enable the simulation-based optimisation process of Layer 2. Within the following 
section, we describe how the ONC system is applied to MANet protocols. 

3.2 Application to Mobile Ad-Hoc Networks 

The focus of Layer is to integrate a new protocol into the framework. Therefore, the 
engineer has to describe its observation and control process leading to the need of two 
interfaces: one for accessing the protocol parameters and one for collecting informa- 
tion about the local status of the system. The former interface enables the framework to 
adapt the behaviour of the protocol which means the parameter settings can be adapted 
at runtime. In the latter interface, the engineer has to define what is relevant and in- 
fluences the protocol's performance - we call this the situation the system is in. In a 
MANet environment, the most important factor influencing the protocol's performance 
is the distribution of other nodes within its sending and sensing range. Therefore, a 
sector-based approach as depicted in Fig.[2]has been developed. The radius of the outer 
circle is equal to the sensing distance (sensDist) of the node, as this is the most remote 
point where messages of this node can interfere with other ones. Typically, the trans- 
mission range for Wireless-LAN based MANets is about 250 meter (half of the sensing 
distance). The radii of the inner circles have been chosen empirically. 

As nodes within the first circle are really close (50m), their exact position does not 
matter. The second circle (125m) has been partitioned into 4 sectors, the third circle 
(200m) into 8 sectors, and the forth circle (250m, maximum transmission range) into 
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Fig. 2. Environment representation. 

16 sectors. The next two circles (375m and 500m) are representing the area within sens- 
ing range - with both circles divided into 32 sectors each. We assume that a node is able 
to determine the current positions of its neighbours in sensing range relative to its own 
position (e. g. based on GPS, see lfl4ll ). Additionally, the node's direction of movement 
is needed since it has high influence on the best parameter set (e. g. moving to/away 
from a set of nodes influences the delay). This sector-based approach generalises situa- 
tions which is necessary to reduce the rule-generation effort. 

The Layer 1 component is responsible for the adaptation process and for increasing 
the system's performance by learning. Again, two aspects have to be considered: a 
learning feedback and a measurement to compare different situations. To enable the 
learning feedback, a local fitness or evaluation function is needed. Several metrics have 
been proposed for analysing MANet protocols at the global scope. Since our current 
focus is set on MANet-based broadcast algorithms, the standard metrics are Packet 
Delivery Ratio and Packet Latency - both cannot be measured locally at each node. But, 
the goal of the system is the same: we want to reduce the overhead, increase the delivery, 
and avoid large delays. Based upon the locally available information, this means to 
reduce the number of forwarded broadcasts and assure the delivery of the broadcast to 
each node at the same time. Therefore, we introduce the following formula: 

#RecMess 

#FwMess 

The variable x represents the particular network protocol instance. Since a new pa- 
rameter set has to be applied for a minimum duration to show its performance, we use 
evaluation cycles defining discrete time slots. The duration of the cycle depends on how 
dynamic an environment is: The faster it changes, the shorter is the cycle to be chosen. 
For the last evaluation cycle, the function takes into account the sum of all messages 
being forwarded by all of the neighbours and the node to be evaluated within the last 
evaluation cycle (#FwMess), and the sum of all messages being received by them 
(#RecMess). 
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The second aspect on Layer 1 is the comparison of situations: we need to quantify 
the distance of two situations. The goal is that more similar situations will receive a 
low distance value and those having low similarity will receive a high distance value. 
Based upon the sector-based situation description as introduced before, a measure for 
the similarity of two entities (A, B) can be defined. To be able to determine the distance, 
the possible influence of rotation and reflection are deducted initially. Afterwards, the 
formula for the distance (5) can be defined with r € RADII and s e SECTORS as 
follows: 

S(A,B) — /. /.(^-r,s — B r ^ s ) 2 /r. distance 

r s 

The function r. distance defines the radius size as introduced before (50m, 125m, . . . ). 
A rjS gives the number of neighbours within the sector s of radius r for the situation 
description A. This means that the importance of a node's neighbour decreases if it is 
situated within an outer radius. 

Finally, Layer 2 has to be able to build adequate simulation scenarios out of the in- 
formation obtained by Layer 1. In ONC, we use the standard network simulation tool 
NS-2 [4 1, but this can easily be exchanged by other solutions. The network simulator 
NS-2 has a large set of integrated or available standard protocols, but for recently devel- 
oped or proprietary protocols a simulation model probably does not exist. The engineer 
has to provide a realistic model (as it is also used in the development process) which 
can be adapted to the observed situation by generating an appropriate scenario. The 
adaptation of the scenario is done using NS-2's configuration interface considering the 
observed situation at the node. In our case, this means, a randomised instance of the 
sector-model is created defining the distribution of the neighbouring nodes along with 
the node's movement direction and transferred to NS-2 for simulation. After finishing 
the previously described tasks, ONC is able to control MANet-based protocols. 

3.3 Evaluation 

The ONC framework is implemented in JAVA. The moving agents communicating via 
the MANet protocol are simulated using the Multi- Agent Simulation Toolkit MASON 
I'll], with each agent's protocol instance representing a SuOC of the architecture as 
depicted in Fig.Q] The respective Layer 1 Controller is an adapted Learning Classifier 
System as described in lf24l . At Layer 2, the standard network simulation tool NS-2 
|4 ] is used to evolve new parameter sets in combination with a standard Genetic Algo- 
rithm (population size: 15, new children per iteration: 7, mutation rate: 0.2 per child, 
all children via crossover with fitness-based selection of parents). We use two different 
simulation tools in order to avoid having exactly the same conditions while optimising 
rules, since a complete copy of the current situation observed in the real environment 
within the simulator is not realistic. 100 agents have been created and applied to the 
simulated area, which has dimensions of 1000 x 1000 meter. The agents move accord- 
ing to a random- way point-model. The Physical/Mac layer is an IEEE 802. 1 1 in ad-hoc 
mode at 2 Mbps. 

To demonstrate the performance of ONC controlling MANets, we choose the 
Reliable Broadcast Protocol (R-BCast) as introduced in [10], since this protocol is 
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representative for the research field of reliable broadcast protocols in MANets. In order 
to achieve reliability and increase the packet delivery ratio compared to other protocols, 
additional effort is made by equipping the nodes with extra buffers. These round-robin 
based buffers are used to store the last p unique packets the particular nodes received. 
In contrast to other protocols, the R-BCast protocol has significantly more variable 
parameters and consequently the task to control the protocol is more complex, but it 
also offers a higher potential benefit due to a dynamic adaptation. The parameters be- 
ing subject to ONC control actions are: Delay (Maximum deceleration time between 
receiving and forwarding of a message), Allowed Hello-loss (Maximum number of 
Hello-messages, which may be lost until a node is assumed to be out of transmission 
range), Hellolnterval (Interval between two Hello-messages), S Hello-Interval (Ran- 
domises Hello-Interval), Packet count (Number of the last x stored NACK messages), 
and Minimum difference (Minimum difference between NACK messages). Details on 
the parameters and the protocol can be found in ifTOl . 

In order to analyse the performance of the ONC system, the simulation is repeated 
for two cases under the same restrictions and using the same seeds for the randomised 
values: a) all nodes are uncontrolled (no ONC system) and use the manually optimised 
standard configuration of the protocol, and b) all nodes have an own instance of the 
ONC system to control their protocol configuration. All values presented in the re- 
mainder of this section are averaged values received from three runs, where each run 
has a duration of 10,000 simulated seconds. During one run of the scenario, 17,400 
BCast-messages have been simulated. The learning component has been trained using 
10 complete runs with different seeds - leading to completely different movements of 
the nodes and along with these to different situations. 

The collected performance figures consist of two basic aspects: the local figures as 
considered for the learning component at each node (local fitness) and a network- wide 
view. The latter is the averaged value of local fitness values which is obtained as refer- 
ence only in the simulator. 

Figure [3] plots the system's performance considering only one node. The X-axis de- 
scribes the simulation time (in simulated seconds) and the Y-axis the measured fitness 
value. In principle, all simulated nodes show a comparable behaviour; this specific node 
has been explicitly chosen to demonstrate the typical differences between an ONC- 
controlled and an uncontrolled node. During the simulation, the node gets separated 
from the rest of the network (no other nodes within sending distance) between simula- 
tion seconds 7,350 and 7,700. Within this interval, the fitness is for both cases. But 
especially these situations demonstrate the benefit of ONC control: The delays have 
been lengthened so that the node receives more old messages when it arrives back in 
sending distance of another node resulting in a quicker recovery of the ONC -controlled 
system. 

Another observation that can be made considering Fig. [3] is the impact of the learn- 
ing module. To be able to learn, it has to be allowed to try different rules and not to 
use always the best matching one. E.g. at simulation second 1,800, the learning com- 
ponent tried a rule that results in a performance slightly worse than the standard pro- 
tocol configuration. These small drawbacks have to be taken into account to achieve 
an improvement for the system. Averaged over the complete simulation time (10,000 
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simulated seconds), the performance of the protocol instance has been enhanced in 
terms of the fitness function from 0.8270 (all nodes perform the standard protocol with- 
out any adaptation) to 0.8991 which is an increase of 8.71%. 

Figure [4] depicts the averaged performance of the network protocol instances on 
network-level. The averaging leads to the effect that separations of single nodes influ- 
ences the performance only slightly. Nevertheless, Fig. [4] shows two drops (simulation 
seconds 3,850 to 4,200 and 7,450 to 7,550). The first drop can be explained by a split 
of the set of nodes - about 30 nodes are not within sending distance of the rest. Here, 
two different networks have been established. Within the second drop, again, a larger 
group (18 nodes) has been separated from the rest of the network. Despite these separa- 
tion effects, the performance of the system has been increased. When all nodes perform 
just the standard protocol configuration without any adaptation, the resulting averaged 
fitness is 0.8760. The same simulation with additional ONC control for all nodes leads 
to an averaged fitness value of 0.9456 which is an increase of 7.94%. To get back to the 
intially named typical metrics like Packet Delivery Ratio (PDR), Packet Latency (PL), 
and Overhead, we can observe a significant improvement. The network- wide measured 
PDR is slightly increased (about 0.8%), while the PL stays nearly constant. But, the 
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number of forwarded messages and hence the total overhead in the network has been 
reduced - which is mainly responsible for the increase in terms of the fitness function. 

4 A Collaboration Mechanism for Knowledge Exchange 

The performance of the ONC system as described before depends strongly on the ex- 
istence of matching rules. Layer l's Controller selects rules based on a similarity mea- 
surement and applies them to the SuOC. This rule- set is only extended by Layer 2, if 
no matching rule exists. The creation of new rules using Layer 2 is a time-consuming 
process (on average 10 min to evolve one rule). Since MANet-nodes are homogeneous 
and create identical situation descriptions for identical situations, tasks for the Layer 2 
component can be transferred to other nodes. Thus, the following part of the chapter 
introduces a novel collaboration mechanism that allows for knowledge exchange (share 
rules between neighbours) and dynamic load balancing (share Layer 2 facilities). 

4.1 The Collaboration Mechanism 

The distributed collaboration mechanism consists of three parts: a local activation (Part 
1), a processing of the request by neighbouring nodes (Part 2), and a local analysis 
that finishes the initial request. For the whole process we assume that all nodes are 
aware of their direct neighbours and nodes are immediately answering the requests, so 
that a short delay time after sending the initial request is enough to ensure an accurate 
feedback. Additionally, it is assumed that all nodes want to collaborate and no bad 
messages or behaviour occur. 

Part 1: Local Activation. The first part of the mechanism is activated if Layer 1 's learn- 
ing component does not contain an adequate classifier for the currently observed situa- 
tion at one node. The initial version of ONC covers the problem by activating its Layer 2 
component to evolve a new rule. The collaboration unit encapsulates this Layer 2 mech- 
anism by involving the neighbours into the rule development process. The main target 
is to re-use existing knowledge from neighbours. Furthermore, the mechanism provides 
an instrument to balance the load of Layer 2 tasks within the network. The collabo- 
ration mechanism is message-based. Such a message exchanged between the nodes is 
composed of three parts: (a) the situation description (the condition part of a rule and 
consequently the corresponding setup information for the simulator at Layer 2), (b) the 
origin node's current queue length of Layer 2 tasks, and (c) a TimeToLive- value (TTL), 
indicating whether the message has to be broadcasted to neighbours further away or 
not: 

Message : \\HEADER\SituationDesciption\QueueSize\TTL\\ 

In Part 1 , the collaboration mechanism generates such a message and sends the request 
to its direct neighbours, which forward it according to the TTL- value. After initialising 
the request, the node waits a given delay time (during that the neighbours are processing 
Part 2) and continues with Part 3. The duration of the delay depends on the channel 
characteristics and can be estimated as follows: 2 x (Maximum sending duration) 
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xTTL + (processing constant). Without neighbours in transmission range, the task 
is directly added to the Layer 2 queue. 

Part 2: Process Request. The second part of the process is performed by all neighbours 
receiving the broadcast message from Part 1 (directly or forwarded). The TTL-part is 
handled by lower protocol layers - thus, the main task of a node is to check its rule 
base. Similar to Layer l's adaptation process, the learning component is asked whether 
it contains an adequate rule for the submitted situation description or not (currently 
queued tasks are taken into account). Existing rules are directly send to the inquirer. 
If no matching rule is available, the received queue-size is compared to the own one: 
For those cases where the received queue size is higher, the node reserves the spot and 
answers with an offer reflecting the next possibility for this node to perform the task. 
Otherwise, a reject is returned to the inquirer. 

Part 3: Finalise Mechanism. Finally, the initial node is able to complete the process 
by analysing the received answers. If the set of answers contains an appropriate rule, 
this is added to Layer l's rule-base. Otherwise, the received offers are considered: The 
best offer will be accepted. If neither a rule nor a suitable offer are available, the task is 
added to the own Layer 2 queue. 



4.2 Evaluation 

The evaluation of the collaboration mechanism is based on the NS-2 scenarios devel- 
oped by Kunz in [10] for the initial protocol. The first scenario contains 100 nodes and 
analyses the load balancing performance: All nodes are using their ONC system, but 
only 10 nodes are able to evolve new rules by using their Layer 2 (they might have 
a better power supply). If a node has no "active" node in its direct neighbourhood, it 
can increase the TTL- value to consider, e.g., a 2-hop neighbourhood. The initial rule 
base for each node is empty, which means the 10 active nodes have to evolve new rules 
for all 100 nodes. All results are averages of ten runs. The scenario is processed in 
three consecutive steps: (1) all 100 nodes determine their current situation (empty rule- 
base), (2) the collaboration mechanism takes place, and (3) the process is validated, i.e. 
the experiment is repeated with the newly acquired knowledge provided by the learn- 
ing mechanism. The learning mechanism was able to respond correctly to a previously 
learned situation which resulted in the same performance figures at both occasions. 
Since the population of the classifier system is initially empty, the standard parameter 
sets are used for the performance evaluation in step 1. In step 2 the queues of all 10 
active nodes have been processed and the rules distributed. The fitness calculation takes 
place after processing each step. The overall fitness (same function as in Sec. 13.21) of 
the system is 0.882 in the first step and increases to 0.926 in the second step (equal to 
step 3). On average, each active node needs 3 hours and 42 minutes to complete the 
queue of optimisation tasks (the maximum is 7h and 6min). Due to the abstraction of 
situations (some situation descriptions are so similar that they have been handled in the 
same way), an averaged number of 94.3 jobs have been performed in total. Each active 
node had to process 9.43 jobs on average (the most heavily loaded node had to process 
1 1.33 jobs). The different size of the queue can be explained by the distribution of the 
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Fig. 5. Messages in the second scenario. 



nodes within the environment - two active nodes had only a few neighbours (lowest: 5), 
while the other eight had to work for comparatively more nodes (up to 14). 

The experiment aims at determining the effort in terms of messages that have to 
be exchanged during the collaboration. The amount of messages sent system-wide is 
changing during the three consecutive steps: The total number of messages drops from 
17922.6 to 17400 on average in step 3. An averaged number of 522.6 messages has 
been simulated for the collaboration mechanism (Broadcasts 99.6, Unicasts 81.9, and 
replies 341.1) - which is only a low overhead compared to the 17400 normal BCast 
messages (2.9%). 

In the following part, we investigated the nodes' reactions on abruptly changing sit- 
uations. Therefore, a static setup of the scenario without node movements has been 
chosen and nodes were exchanged. Nodes are not moving to avoid the occurrence of 
new situations and force them to receive the particular rule from their neighbours. The 
exchange takes place by randomly choosing two nodes and switching their positions. 
The scenario consists of 25 steps and contains again 100 nodes, the broadcast distance 
is two hops. Initially, all nodes are active nodes (they learn the matching action for their 
particular situation). Since nodes are not moving, the environment is static and con- 
sequently the observed situations stay the same. Within each of the 25 steps, 10 pairs 
of nodes are randomly selected and their positions exchanged. Probably, these nodes 
have to receive a matching rule due to the changed situation - although their Layer 2 
is deactivated (at least the predecessor at that position has an appropriate rule). Fig- 
ure \5\ demonstrates the message traffic for all steps. The lines for the number of jobs 
decreases nearly to 0, which means that almost every exchange is followed by a trans- 
mission of the rule. In some cases a job is performed, which means that the exchanged 
pair is not available with a 2-hop broadcast. The number of responses (and also the ra- 
tio reply/request) increases continuously as nodes do not delete rules. Hence, the goal 
of the collaboration mechanism is fulfilled, since the exchange of rules works and the 
re-creation of rule is avoided. 
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5 Conclusions 

This chapter discussed the application of the Organic Network Control (ONC) system 
to the control of mobile ad-hoc network (MANet) protocols. Therefore, a brief overview 
of the system has been given, followed by a description of the adjustments needed to 
apply ONC to this type of protocols. Afterwards, the benefit of the additional ONC con- 
trol has been demonstrated using an exemplary MANet-broadcast protocol. We identi- 
fied the duration of the rule-generation (which is crucial for the safety-based learning 
approach) as a limiting aspect of the ONC approach. Hence, a dynamic collaboration 
mechanism has been introduced that decreases the number of performed rule genera- 
tions significantly and avoids idle times for Layer 2 of the architecture by dynamic load 
balancing. 

Besides more analysis on the behaviour of the ONC system in the presented setup, 
future work will mainly focus on demonstrating the generic character of the approach by 
applying it to other types of protocols. Furthermore, the learning and the optimisation 
component will be considered in detail with a special focus on comparing the currently 
used techniques to possible alternatives. 
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Abstract. In this work we propose a clustering algorithm that learns on-line a 
finite gaussian mixture model from multivariate data based on the expectation 
maximization approach. The convergence of the right number of components as 
well as their means and covariances is achieved without requiring any careful 
initialization. Our methodology starts from a single mixture component covering 
the whole data set and sequentially splits it incrementally during the expectation 
maximization steps. Once the stopping criteria has been reached, the classical EM 
algorithm with the best selected mixture is run in order to optimize the solution. 
We show the effectiveness of the method in a series of simulated experiments and 
compare in with a state-of-the-art alternative technique both with synthetic data 
and real images, including experiments with the iCub humanoid robot. 

Keywords: Image processing, Unsupervised learning, (Self- adapting) Gaussians 
mixtures, Expectation maximization, Machine learning, Clustering. 



1 Introduction 

Nowadays, computer vision and image processing are involved in many practical ap- 
plications. The constant progress in hardware technologies leads to new computing ca- 
pabilities, and therefore to the possibilities of exploiting new techniques, for instance 
considered to time consuming only a few years ago. Image segmentation is a key low 
level perceptual capability in many robotics related application, as a support function 
for the detection and representation of objects and regions with similar photometric 
properties. Several applications in humanoid robots [11], rescue robots |2], or soccer 
robots [6) rely on some sort on image segmentation [15 1. Additionally, many other 
fields of image analysis depend on the performance and limitations of existing image 
segmentation algorithms: video surveillance, medical imaging and database retrieval 
are some examples (4), E3 - Two main principal approaches for image segmentation 
are adopted: Supervised and unsupervised. The latter one is the one of most practical 
interest. It may be defined as the task of segmenting an image in different regions based 
on some similarity criterion among each region's pixels. 

One of the most widely used distributions is the Normal distribution. Due to the 
central limit theorem, any variable that is the sum of a large number of independent 
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factors is likely to be normally distributed. For this reason, the normal distribution is 
used throughout statistics, natural science, and social science as a simple model for 
complex phenomena. If we model the entire dataset by a mixture of gaussians, the clus- 
tering problem, subsequently, will reduce to the estimation of the gaussians mixture's 
parameters. 

1.1 Related Work 

Expectation-Maximization (EM) algorithm is the standard approach for learning the 
parameters of the mixture model [8]. It is demonstrated that it always converges to a lo- 
cal optimum [3 ]. However, it also presents some drawbacks. For instance, EM requires 
an a-priori selection of model order, namely, the number of components to be incor- 
porated into the model, and its results depend on initialization. The higher the number 
of components within the mixture, the higher will be the total log- likelihood. Unfortu- 
nately, increasing the number of gaussians will lead to overfitting and to an increase of 
the computational burden. 

Particularly in image segmentation applications, where the number of points is in 
the order of several hundred thousand, finding the best compromise between precision, 
generalization and speed is a must. A common approach to choose the number of com- 
ponents is trying different configurations before determining the optimal solution, e.g. 
by applying the algorithm for a different number of components, and selecting the best 
model according to appropriate criteria. 

Adaptive mixture models can solve the problem of the original EM's model selec- 
tion. It was originally proposed in 2000 by Li and Barron |[T0lL and subsequently ex- 
plored in 2003 by Verbeek et al. in [14]. They developed a deterministic greedy method 
to learn the gaussians mixture model configuration [14]. At the beginning a single com- 
ponent is used. Then, new components are added iteratively and the EM is applied until 
it reaches the convergence. 

Ueda et Al. proposed a split-and-merge EM algorithm to alleviate the problem of lo- 
cal convergence of the EM method [13]. Subsequently, Zhang et Al. introduced another 
split-and-merge technique [ 16 1. Merge an split criterion is efficient in reducing number 
of model hypothesis, and it is often more efficient than exhaustive, random or genetic 
algorithm approaches. To this aim, particularly interesting is the method proposed by 
Figueiredo and Jain, which goes on step by step until convergence using only merge 
operations Q. 

1.2 Our Contribution 

We propose an algorithm that simultaneously determines the number of components 
and the parameters of the mixture model with only split operations. In [7 ] we previ- 
ously proposed a split and merge technique for learning finite Gaussian mixture mod- 
els. However, the principal drawbacks were the initialization, with particular regards 
to the beginning number of mixture classes, and the superimposition of the split and 
merge operations. The particularly of our new model is that it starts from only one 
mixture component progressively adapting the mixture by splitting components when 
necessary. 
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In a sense, we approach the problem in a different way than [5]. They start the com- 
putation with the maximum possible number of mixture components. Although that 
work is among the most effective to date, it becomes too computationally expensive for 
image segmentation applications, especially during the first iterations. It starts with the 
maximum number of components, decreasing it progressively until the whole space of 
possibilities has been explored, whereas our method starts with a single component and 
increases its number until a good performance is attained. 

1.3 Outline 

The paper is organized as follows. In sec. [5] we introduce the proposed algorithm. 
Specifically, we describe its formulation in sec. 13.11 the initialization in sec. 13.21 the 
component split operation in sec. 13.41 and the decision thresholds update rules in sec. 
13.51 Furthermore, in sec. [4] we describe our experimental set-up for testing the validity 
of our new technique and in sec. Owe discuss our results. Finally, in sec.[6]we conclude. 

2 Expectation Maximization Algorithm 

2.1 EM Algorithm: The Original Formulation 

A common usage of the EM algorithm is to identify the "incomplete, or unobserved 
data" y = (y 1 , y 2 , . . . , y k ) given the couple (X, y) - with x defined as X = {x 1 , x 2 , 
. . . , x k }, also called "complete data", which has a probability density (or joint distri- 
bution) p(X, y\$) = p$(X, y) depending on the parameter $. More specifically, the 
"complete data" are the given input data set X to be classified, while the "incomplete 
data " are a series of auxiliary variables in the set y indicating for each input sam- 
ple which mixture component it comes from. We define E (•) the expected value of a 
random variable, computed with respect to the density p$(x, y). 

We define Q(¥ n \¥ n -^) = E' L(&), with L(#) being the log-likelihood of the 
observed data: 

L($) = \ogpt(x,y) CD 

The EM procedure repeats the two following steps until convergence, iteratively: 

- E-step: It computes the expectation of the joint probability density: 

Q(& n \$ n -V) =E[\ogp{X,y\^ n - 1) )} (2) 

- M-step: It evaluates the new parameters that maximize Q: 

$(n+i) = a^maxQ^T,^- 1 )) (3) 

The convergence to a local maxima is guaranteed. However, the obtained parameter 
estimates, and therefore, the accuracy of the method greatly depend on the initial pa- 
rameters $° . 



108 N. Greggio, A. Bernardino, and J. Santos- Victor 

2.2 EM Algorithm: Application to a Gaussians Mixture 

When applied to a Gaussian mixture density we assume the following model: 

nc 



c=l (4) 

Pc (x) = 2 r^" 

(2tt)*|2; c |* 



1 -\(x-p. c ) T \E c \- 1 {x-p. c ) 



where p c (x) is the component prior distribution for the class c, and with d, £l c and U c 
being the input dimension, the mean and covariance matrix of the gaussians component 
c, and nc the total number of components, respectively. 

Consider that we have nc classes C nc , with p(x\C c ) — p c (x) and P (C c ) = w c 
being the density and the a-priori probability of the data of the class C c , respectively. In 
this setting, the unobserved data set y = (^ 1 , y 2 , . . . , y N ) contains as many elements 

as data samples, and each vector y % = [y\, y\, • • • , y 1 ^ • • • y z nc \ is such that y l c = 1 if 
the data sample x* belongs to the class C c and y\ = otherwise. The expected value of 
the c t/l component of the random vector y is the class C c prior probability E (y c ) = w c . 
Then the E and M steps become, respectively: 



E-step: 



P(y i c = l\x i )=P(C c \x i ) 

_p(x*\C c ).P(C c ) _ wcpcffl 



P(X 1 ) YTJLlWc'Pcix 1 ) 



(5) 



= 7r„ 



For simplicity of notation, from now on we will refer to E (y c \x z ) as tt z c . This is prob- 
ability that the x k belongs to the class C c . 

M-step: 



(6) 



Finally, a-priori probabilities of the classes, i.e. the probability that the data belongs to 
the class c, are reestimated as: 

1 N 
™c n+1) = ^I>c, with c={l,2,...,nc} (7) 
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3 FASTGMM: Fast Gaussian Mixture Modeling 

Our algorithm starts with a single component and only increments its number as the 
optimization procedure progresses. With respect to the other approaches, our is the one 
with the minimal computational cost. 

The key issue of our technique is looking whether one or more gaussians are not 
increasing their own likelihood during optimization. Our algorithm evaluates the current 
likelihood of each single component c as ®: 

N 
A CU rr(c) (#) = ^2 l ° g ( Wc ' Pc ( S *)) ^ 

2=1 

In other words, if their likelihood has stabilized they will be split into two new ones and 
check if this move improves the likelihood in the long run. For our algorithm we need 
to introduce a state variable related to the state of the gaussian component: 

- Its age, that measures how long the component's own likelihood does not increase 
significantly (see sec. 13. Ik 

Then, the split process is controlled by the following adaptive decision thresholds: 

- One adaptive threshold Ath for determining a significant increase in likelihood 
(see sec. 13.5k 

- One adaptive threshold Ath for triggering the split process based on the compo- 
nent's own age (see sec. 13.5k 

- One adaptive threshold ^th for deciding to split a gaussian based on its area (see 
sec. [331). 



It is worth noticing that even though we consider three thresholds to tune, all of them 
are adaptive, and only require a coarse initialization. 

These parameters will be fully detailed within the next sections. 

3.1 FASTGMM Formulation 

Our algorithm's formulation can be summarized within three steps: 

- Initializing the parameters; 

- Splitting a gaussian; 

- Updating decision thresholds. 

Each mixture component c is represented as follows: 

$c = Q (w c ,fic, Eciic, Alast(c)i ^curr{c)-> a c) (9) 

where each element is described in tab. I. In the rest of the paper the index notation 
described in tab. I will be used. 

Here, we define two new elements, the area (namely, the covariance matrix determi- 
nant) and the age of the gaussians, which will be described later. 
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Table 1. Symbol notation used in this paper. 



Symbol 


Element 


W c 


a-priori probabilities of the class c 


Jic 


mean of the gaussian component c 


Zc 


covariance matrix of the gaussian component c 


£c 


area of the gaussian component c 


Alast(c) 


log-likelihood at iteration t — 1 of the gaussian component c 


A curr(c) 


log-likelihood at iteration t of the gaussian component c 


a c 


age of the gaussian component c 


c 


single mixture component 


nc 


total number of mixture components 


i 


single input point 


N 


total number input points 


d 


single data dimension 


D 


input dimensionality 



During each iteration, the algorithm keeps memory of the previous likelihood. Once 
the re-estimation of the vector parameter $ has been computed in the EM step, our 
algorithm evaluates the current likelihood of each single component c as: 

If cii overcomes the age threshold Ath (i.e. the gaussians i does not increase its own 
likelihood for a predetermined number of times significally - over Ath), the algorithm 
decides whether to split this gaussians depending on whether their own single area 
overcome £th • 

The whole algorithm pseudocode is shown in Algorithm[T] 



Algorithm 1. FASTGMM: Pseudocode. 



l 

2 
3 
4 
5 
6 

7 

8 

9 

10 



11 
12 
13 
14 

15 
16 
17 

18 

19 

20 



Parameter initialization; 
while (stopping criterion is not met) do 



A c 



, evaluation, for c = , 1 , 



Whole mixture log-likelihood L (#) evaluation; 

Re-estimate priors w c , for c = 0, 1, . . . , nc; 

Recompute center p,(, n+1 > and covariances E^J 1 ^ 1 ' , for c = 0, 1, 

- Evaluation whether changing the gaussians distribution structure; 

for ( c = to nc) do 

if(a c > A TH )then 



if ((A 



st(c) ) < yl Tj Ff)then 



rr(c) - 
a c + = 1; 

- General condition for changing satisfied; now checking 
those for each component; 
iS(E c > £th) then 

if f c < max NumComponents) then 
split gaussians — » split; 



nc-\- = 1; 

reset £ T h <- 
reset Ath ^~ Ltf 
reset a a, o>b <— ■ 
gaussians; 
return; 



£,th — £,th • (1 + ol • £); 



^TH-INIT . 



-INIT, 

with A, B being the new two 



Optional: Optimizing selected mixture; 
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3.2 Parameters Initialization 

The decision thresholds (-) INIT will be initialized as follows: 

^TH-INIT = idata] 

Lth-init = ^LTH\ (10) 

AtH-INIT = kATH 

with kLTH an d ^ath (namely, the minimum amount of likelihood difference between 
two iterations and the number of iterations required for taking into account the lack of 
a likelihood consistent variation) relatively low (i.e. both in the order of 10, or 20). Of 
course, higher values for kLTH an d smaller for kATH gi ye ri se to a faster adaptation, 
however adding instabilities. 

At the beginning, before starting with the iterations, ^th will be automatically ini- 
tialized to the Area of the whole data set - i.e. the determinant of the co variance matrix 
relative to all points, as follows: 



N 

N^ Xd (U) 

^data,i = \%i ~ J^data)\^i ~ fidata) 



f^data,d — at / j X d 



where N is the number of input data vectors x, and D their dimensionality. 

3.3 Gaussian Components Initialization 

The algorithm starts with just only one gaussian. Its mean will be the whole data mean, 
as well as its covariance matrix will be that of the whole data set. 
That leads to a unique starting configuration. 

3.4 Splitting a Gaussian 

When a component's covariance matrix area overcomes the maximum area threshold 
£th it will split. As a measure of the area we adopt the matrix's determinant. This, in 
fact, describes the area of the ellipse represented by a gaussian component in 2D, or the 
volume of the ellipsoid represented by the same component in 3D. 

It is worth noticing that the way the component is split greatly affects further com- 
putations. For instance, consider a 2-dimensional case, in which an elongated gaussian 
is present. Depending on the problem at hand, this component may approximating two 
components with diverse configurations: Either covering two smaller data distribution 
sets, placed along the longer axis, or two ovelapped sets of data with different co vari- 
ances, etc. A reasonable way of splitting is to put the new means at the two major 
semi-axis' middle point. Doing so, the new components will promote non overlapping 
components and, if the actual data set reflects this assumption, it will result in faster 
convergence. 

To implement this split operation we make use of the singular value decomposition. 
A rectangular nxp matrix A can be decomposed as A = USV T , where the columns 
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of U are the left singular vectors, S (which has the same dimension as A) is a diagonal 
matrix with the singular values arranged in descending order, and V T has rows that are 
the right singular vectors. However, we are not interested in the whole set of eigenval- 
ues, but only the bigger one, therefore we can save some computation by evaluating 
only the first column of U and the first element of S. 

More precisely, A gaussian with parameters $old will be split in two new gaussians 
A and B, with means: 

Sold = USV T 

Umax = E/*,i; smax = £1,1 Q2) 

1 1 

Ha = Hold + -smaxumax', Hb = Hold — -smaxumax 

where Umax is the first column of U, and smax the first element of S. 
The covariance matrices will then be updated as: 

Si,i = \smax\ Za = Zb = USV t (13) 

while the new a-priori probabilities will be: 

1 1 

W A = -WOLD] W B = -WOLD ( 14 ) 

The decision thresholds will be updated as explained in sec. 13.51 
Finally, their ages, a a and a#, will be reset to zero. 

3.5 Updating Decision Thresholds 

The decision thresholds are updated in two situations: 

A. When a mixture component is split; 

B. When each iteration is concluded. 

These two procedures will be explained in the following. 

- Single Iteration: The thresholds Ath, and ^th vary at each step with the following 
rules: 

Ath = Ath 9 • Ath — Ath • ( 1 9 

t t OlMAX t t ( a a MAX^ 
t,TH = t,TH 5— • STH = STH ' U 9 



with nc is the number of current gaussians, A, and olmax are the coefficients for the 
likelihood and area change evaluation, respectively. Using high values for A, and olmax 
results in high convergence speed. However, a faster convergence is often associated 
to instability around the optimal point, and may lead to a divergence from the local 
optimum. We can say that olmax can be interpreted as the speed the mixture compo- 
nents are split. In normal conditions, ^th will become closer to the area of the bigger 
component's determinant step-by-step at each iteration. Then, it will approach the split 
threshold, allowing the splitting procedure. 
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Following an analog rule, Ath will decrease step by step, approaching the current 
value of the global log-likelihood increment. This will allow the system to avoid some 
local optima, by varying its configuration whether a stationary situation occurs. More- 
over, dividing A and olmax by the square of nc consents to reduce the variation of the 
splitting threshold according to the number of components increases with a parabolic 
curve. This favorites the splitting when a low number of components is present, while 
avoiding a diverging behavior in case of an excessive amount of splitting operations. 

Finally, every time a gaussians is added these thresholds will be reset to their initial 
value (see next section). 

- After Gaussian Splitting: The decision thresholds will be updated as follows: 

CTH T Af TT 

£,th = ; Ath = Lth-init (16) 

nc 

where ucold and nc are the previous and the current number of mixture compo- 
nents, respectively. Substantially, this updates the splitting threshold to a value that 
goes linearly with the initial value and the actual number of components used for the 
computation. 

3.6 Optimizing the Selected Mixture 

This is an optional procedure. Once the best, or chosen, mixture, is saved, there are two 
possibilities: 

1 . Keeping the chosen mixture as the final result; 

2. Optimizing the chosen mixture with the original EM algorithm. 

The first one is the fastest but less accurate, while the second one introduces new com- 
putations ensuring more precision. It may happen that FASTGMM decides to increase 
the number of components even though the EM has not reached its local maximum, due 
to the splitting rule. In this case current mixture can still be improved by running the 
EM until it achieves its best configuration (the log-likelihood no longer increases). 

Whether applying the first or second procedure is a matter of what predominates in 
the "number of iterations vs. solution precision" compromise at each time. 

3.7 Computational Complexity Evaluation 

We refer to the pseudocode in algorithmQ] and to the notation presented in sec. l3.1I The 
computational burden of each iteration is: 

- the original EM algorithm (steps Q] to Q]) takes o (n ■ d ■ nc) for each step, for a total 
of o (4 • n ■ d ■ nc) operations; 

- our algorithm takes o (nc) for evaluating all the gaussians (step Q] to [B; 

- our split (stepHJ operation requires o (D) . 

- the others take o (l) . 

- the optional procedure of optimizing the selected mixture (step Q]) takes 
o (4 • n ■ d ■ nc), being the original EM. 
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Therefore, the original EM algorithm takes: 

- o (4 • n ■ d ■ nc) , while our algorithm adds o (d ■ nc) on the whole, or o (4 • n ■ d ■ nc) , 

giving rise tO O (4 • N ■ D ■ nc) + O (D ■ nc) = O (4 • N ■ D ■ nc + D ■ nc) = (nc ■ D ■ (47V + 1)) 

in the first case; 

- 2 • O (4 • N ■ D ■ nc) + O (D ■ nc) = O (8 • N ■ D ■ nc + D ■ nc) = (nc ■ D ■ (8N + 1)) in the Second 

case, with the optimization procedure. 

Considering that usually D « N and nc « N, and that the optimization procedure 
is not essential, our procedure does not add a considerable burden, while giving an im- 
portant improvement to the original computation in terms of self-adapting to the data 
input configuration at best. Moreover, it is worth noticing that even though the optimiza- 
tion procedure is performed, this starts very close to the optimal mixture configuration. 
In fact, the input mixture is the result of the FASTGMM computation, rather than a 
generic random or k-means initialization (as it happens with the simple EM algorithm, 
generally). 

4 Experiments 

Since now we use the following notation: 

- FASTGMM: Our algorithm; 

- FIGJ: 0. 

4.1 Synthetic Data 

We tested it by classifying different input data sets randomly generated by a known 
gaussians mixture. The same input sets have been proposed to |5|. Each distribution 
has a total of 2000 points, but arranged with different mixture distributions, with 3, 4, 
8, and 16 Gaussian components. 

The output of the two algorithms is shown in Fig. 1. Each subplot set is composed 
by the graphical output representation for the 2-D point distribution (top) and the 3- 
D estimation mixture histogram (bottom). The data plots show the generation mixture 
(blue) and the evaluated one (red). On the left the data result from our approach is 
shown, while on the right those of 0, relative to the same input data set. Moreover, the 
3D histograms at the bottom in each subfigure represent: The generated mixture, our 
algorithm's estimated one, and that estimated by |5], respectively. 

We can see that our algorithm is capable to learn the input data mixture starting from 
only one component with an accuracy comparable with those of 0. 

4.2 Colored Real Images 

We segmented the images as 3 -dimensional input in the (R,G,B) space. The color image 
segmentation results are shown in Fig. 2. The set of images is divided into two groups: 
Some general images, on the left (from (1) to (6)), and some images taken by the iCub's 
cameras, on the right (from (7) to (12)). For each group we show the original images, 
those obtained with [5 1, and those obtained with our algorithm on the left, in the middle, 
and on the right, respectively. 



Unsupervised Learning of Finite Gaussian Mixture Models (GMMs) 115 



N on- Overlapping mixtures 



3-gaussians 

FASTGMM FIGJ 



(ft, @ 






FIGJ 

m ' @ 

9 @ 
@ - *> 

..&■" /Mi' 



Overlapping mixtures 



8-gaussians 



16-gaussians 






Fig. 1. For each plot set: Generation mixture (blue) and the evaluated one (red) for FASTGMM 
and FIGJ on the same input sets. Moreover, the 3D histograms at the bottom in each subfig- 
ure represent: The generated mixture, our algorithm's estimated one, and that estimated by 0, 
respectively. 



5 Discussion 

5.1 Synthetic Data 

In table[2]the results of FASTGMM and FIGJ applied to the selected images are shown. 

Table 2. Experimental results on 2D synthetic data. 



Inpu, 


Algorithm 


# Inhilal 


# Detected 


Actual gaussian 
number 


# Iterations 


Elapsed Time 
[s] 


Dill lime FASTGMM 
with opt vs FIGJ% 


Log-likelihood 


Difflik FASTGMM 
with opt vs FIGJ % 


Normalized L2 Distance 
without optimization 


Normalized 1,2 Distance 
with optimization 


Clashed 


3 


FSAEM 


• 


3 


3 


76 


3.99716 


-53.89844289 


-8420.917867 


0.495867477 


5.770135 


3.918034 


no 


Optimization 


3 


130 


6.151567 


-8379.161274 


FSAEM + Opt. 


3 


206 


10.148727 


-8379.161274 


FIGJ 


16 


3 


277 


29.433288 


-9524.692099 


3.670464 


3.670464 


no 


4 


FSAEM 


1 


4 


4 


101 


5.615204 


-123.166389 


-7573.101881 


2.218687212 


10.670613 


0.07519 


no 


Optimization 


4 


186 


12.531248 


-7405.078438 


FSAEM + Opt. 


4 


287 


18.146452 


-7405.078438 


FIGJ 


16 


4 


205 


13.52505 


-8729.761818 


0.076403 


0.076403 


no 


8 


FSAEM 


• 


9 


8 


276 


5.750431 


22.99575458 


-8599.51 


0.015582283 


0.196817 


1.971166 


no 


Optimization 


9 


199 


4.428076 


-8598.17 


FSAEM + Opt. 


9 


475 


10.178507 


-8598.17 


FIGJ 


16 


7 


333 


48.156629 


-9798.154848 


0.14491 


0.14491 


no 


16 


FSAEM 


1 


16 


16 


501 


26.667825 


51.82061529 


-8165.436422 


0.057038433 


0.251515 


1.033934 


no 


Optimization 


16 


202 


12.848394 


-8160.778985 


FSAEM + Opt. 


16 


703 


39.516219 


-8160.778985 


FIGJ 


20 


13 


363 


63.740854 


-9540.91802 


2.98916 


2.98916 


no 



5.1.1 Evaluated Number of Components 

There are substantially no differences in the selected number of components. Both our 
approach and [ 5 ] perform well on low mixture components, while having the tendency 
of underestimating the best number when it increases, with exception for our approach 
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Synthetic or generic images 



Original Image 



FSAEM Original Image FIGJ 
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Fig. 2. Color images segmentation. From image (1) to (6) we tested the algorithms on well- 
known images, or synthetic ones, and from (7) to (12) we exploit the algorithms possibilities on 
real images captured by our robotic platform iCub's cameras. Note: FIGJ has not been able to 
segment image (13) also starting with merely 2 components, due to some internal covariances 
ill-posedness problems. 

that overestimates it with the 8 -component synthetic data and acting exactly with the 
16-components. However, it is worth considering that even though it approaches the 
actual number correctly, this not necessary means that the components are in the right 
place. For instance, two components may be regarded as only one, while a single one 
can be considered as a multiple one. Nonetheless, it happens for both algorithm (see 
Fig. 1), suggesting that a perfect algorithm is hard to find. 



5.1.2 Elapsed Time 

It is important to distinguish the required number of iterations from the elapsed time. 
FASTGMM employs fewer iterations than FIGJ without making use of the optimization 
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Table 3. Experimental results on real images segmentation. 



Input 


Algorithm 


# Initial 
gaussians 


# Detected 
gaussians 


# Iterations 


Elapsed Time 

[s] 


Diff time FASTGMM 
with opt vs FIGJ % 


Log-likelihood 


Diff lik FASTGMM 
with opt vs FIGJ 


Diff lik FASTGMM 
with opt vs FIGJ 


Crashed 


1 


FASTGMM 


1 


9 


551 


71.460507 


130.9699636 


-235130.6216 


0.186374662 


17.03330409 


no 


Optimization 


9 


23 


22.131293 


-234692.3977 


FASTGMM + Opt. 


9 


700 


93.5918 


-234692.3977 


FIGJ 


16 


16 


422 


307.454885 


-274668.2675 


yes, 17 


2 


FASTGMM 


1 


9 


426 


68.31949 


2.854809074 


-301594.6385 


6.29984E-09 


39.70671245 


no 


Optimization 


9 


3 


1.950391 


-301594.6384 


FASTGMM + Opt. 


9 


429 


70.269882 


-301594.6384 


FIGJ 


19 


19 


308 


441.170848 


-421347.9543 


yes, 20 


3 


FASTGMM 


1 


14 


426 


80.365553 


109.5710584 


-314931.44 


0.120630946 


11.16531396 


no 


Optimization 


14 


374 


88.057387 


-314551.5352 


FASTGMM + Opt. 


14 


800 


168.422941 


-314551.5352 


FIGJ 


30 


25 


572 


1611.816189 


-349672.2017 


no 


4 


FASTGMM 


1 


12 


451 


75.660802 


28.28196296 


-235735.0604 


0.00245062 


28.67314092 


no 


Optimization 


11 


103 


21.39836 


-235729.2834 


FASTGMM + Opt. 


11 


554 


97.059162 


-235729.2834 


FIGJ 


18 


19 


329 


500.708511 


-303320.2731 


yes 19 


5 


FASTGMM 


1 


13 


451 


80.165239 


72.03688746 


-311585.8059 


0.134313369 


10.60749104 


no 


Optimization 


13 


246 


57.748543 


-311167.3045 


FASTGMM + Opt. 


13 


697 


137.913782 


-311167.3045 


FIGJ 


30 


27 


611 


2110.020131 


.344174.3484 


no 


6 


FASTGMM 


1 


7 


276 


34.977055 


16.7543208 


-226138.1494 


3.34203E-05 


17.57803149 


no 


Optimization 


7 


36 


5.860168 


-226138.0738 


FASTGMM + Opt. 


7 


312 


40.837223 


-226138.0738 


FIGJ 


16 


16 


420 


272.227272 


-265888.6956 


yes, 17 


7 


FASTGMM 


1 


7 


576 


72.744922 


3.588007146 


-281176.5478 


1.30736E-06 


15.72288044 


no 


Optimization 


7 


14 


2.610093 


-281176.5441 


FASTGMM + Opt. 


7 


590 


75.355015 


-281176.5441 


FIGJ 


11 


11 


267 


106.453566 


-325385.5959 


yes, 12 


8 


FASTGMM 


1 


8 


426 


54.119749 


11.97025322 


-205718.9337 


7.50364E-05 


26.47107115 


no 


Optimization 


8 


46 


6.478271 


-205718.7793 


FASTGMM + Opt. 


8 


472 


60.59802 


-205718.7793 


FIGJ 


11 


11 


228 


90.200195 


-260174.7438 


yes, 12 


9 


FASTGMM 


1 


4 


251 


25.685892 


5.220795914 


-211551.64 





15.41441064 


no 


Optimization 


4 


3 


1.341008 


-211551.64 


FASTGMM + Opt. 


4 


254 


27.026901 


-211551.64 


FIGJ 


13 


13 


313 


137.77516 


-244161.0785 


yes, 14 


10 


FASTGMM 


1 


3 


201 


18.822216 


34.15089913 


-230138.8367 


0.000990588 


14.70275567 


no 


Optimization 


3 


63 


6.427956 


-230136.557 


FASTGMM + Opt. 


3 


264 


25.250172 


-230136.557 


FIGJ 


14 


13 


275 


106.626624 


-263972.9727 


yes, 14 


11 


FASTGMM 


1 


11 


451 


67.534808 


47.35502469 


-210899.0185 


0.114549212 


17.68706256 


no 


Optimization 


10 


180 


31.981125 


-210657.4353 


FASTGMM + Opt. 


10 


631 


99.515933 


-210657.4353 


FIGJ 


24 


22 


514 


624.913104 


-247916.5477 


no 


12 


FASTGMM 


1 


3 


151 


16.899695 


13.4646217 


-218447.7912 


2.96364E-06 


16.0195988 


no 


Optimization 


3 


23 


2.27548 


-218447.7848 


FASTGMM + Opt. 


3 


174 


19.943671 


-218447.7848 


FIGJ 


12 


12 


260 


130.416222 


-253442.2435 


yes, 13 



process, while more in the other case. At a first glance, this may suggest a whole FAST- 
GMM slower computation than FIGJ. However, the whole elapsed time that occurs for 
running our procedure is generally less than FIGJ's. Nevertheless, we made FIGJ start- 
ing with a reasonable number of components, just a few more than the optimum, so that 
they do not affect its performance negatively. FASTGMM 's better performance is due 
to the fact that our approach, growing in the number of components, computes more 
iterations than FIGJ but with a small number of components per iteration. Therefore it 
runs each iteration faster, while slowing only at the end due to the augmented number 
of components. 



5.1.3 Mixture Precision Estimation 

It is possible to see that FASTGMM usually achieves a higher final log-likelihood than 
FIGJ. This suggests a better approximation of the data mixture. However, a higher log- 
likelihood does not strictly imply that the extracted mixture covers the data better than 
another one. This is because it is based on the probability of each component, which 
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may be more or less exact, being not deterministic. Nevertheless, it is a good index on 
the probability that such mixture would be better. 

A deterministic approach is to adopt a unique distance measure between the genera- 
tion mixture and the evaluated one. In [9 ] Jensen etAL exposed three different strategies 
for computing such distance: The Kullback-Leibler, the Earh Mover, and the Normal- 
ized L2 distance. The first one is not symmetric, even though a symmetrized version 
is usually adopted in music retrival. However, this measure can be evaluated in a close 
form only with mono-dimensional gaussians. The second one also suffers analog prob- 
lems of the latter. The third choice, finally is symmetric, obeys to the triangle inequality 
and it is easy to compute, with a comparable precision with the other two. We then used 
the last one. Its expression states Q: 

z c N x (/Z c , S c ) = N x (p a , E a ) ■ N x {fi h , E h ) 
where 

5 c =(r- 1 + r 6 - 1 )" 1 and p c = £ c (S- Va + S^fib) (17) 

5.2 Colored Real Images 

It is salient to report that in [ 5 ] it has not been performed any experiment on real images 
segmentation. In fact, their result only concern different families of synthetic data. Con- 
trariwise, we want to focus more on image processing, due to its relevant importance in 
several different scientific fields, like robotics and medicine, as mentionned within the 
introduction. 

As we pointed out in the previous section (sec. l4.2K to compare a detected mixture 
versus a generic image is not possible quantitatively, only qualitatively. This is due to 
the high number of colors present within the image. It is obvious that with more com- 
ponents the image is better reconstructed. However, it is possible to visually recognize 
a pattern even with fewer components, although with less accuracy. Therefore, what 
algorithm gives the best result is again a matter of what compromise is better, in terms 
of computational complexity and result accuracy. 

Moreover, we have to report a problem that has not been addressed in the original 
work [5 ]. This crashes with some images when the number of components increases too 
much. Than means that it is not able to finish the computation. Therefore, as reported on 
tab.[3]we had to start it with a relatively low number of gaussians. However, even though 
it is able to finish the computation, it very often returns a mixture having the same 
starting number of components as the best one. Moreover, it explores the whole solution 
space, from the input mixture to one with only one element. This makes pointless the 
usage of such approach: Segmenting an image with the original EM instead of |5 1 will 
give the same result in less time. 
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5.3 FASTGMM Optimization Procedure 

We reported our results both using the optimization procedure, and not. Since one of 
the most prominent key feature of our approach is its fast computation, together with 
its simple implementation, the optimization process may seem worthless or too much 
time demanding. However, by comparing its performances against those of Q, our 
algorithm still remain faster (see sec. 5.1.2). The difference in terms of final mixture 
precision is not so evident at a first glance, both referring to the final log-likelihood and 
to the normalized L2 distance, although present. Nevertheless, the required time for 
the EM optimization step is important, since sometimes it approaches (and overcome) 
the splitting part. Here, selecting whether optimizing or not is merely a question of 
performance requiring. If one claims for the fastest algorithm it is advisable to not use 
the optimization, even though it may lead to some improvements to the final mixture. 
Otherwise, FASTGMM gives a good precision maintaing a better computational burden 
than FIGJ. 

5.4 Limitation of the Proposed Algorithm 

The bigger issue with our approach is the a max parameter tuning. This cause FAST- 
GMM being less general than FIGJ in input domain. If olmax is too small the input 
description may be underestimated, or overestimated if it is too high. This does not 
mean that it cannot perform in general purposes, but only that it has to be tuned for 
getting precise results. However, this makes FASTGMM suitable for a first data de- 
scription due to its great velocity. Once a first input analysis has been performed, it can 
be fine tuned to have a better data description. Moreover, we demonstrated that if well 
tuned, FASTGMM is able to segment the input data even better than FIGJ. 

6 Conclusions 

In this paper we proposed a unsupervised algorithm that learns a finite mixture model 
from multivariate data on-line. The algorithm can be applied to any data mixture where 
the EM can be used. We approached the problem from the opposite way of |5], i.e. by 
starting from only one mixture component instead of several ones and progressively 
adapting the mixture by adding new components when necessary. Our algorithm starts 
from a single mixture component and sequentially growing both increases the number 
of components and adapting their means and covariances. Therefore, its initialization is 
unique, and it is not affected by different possible starting points like the original EM 
formulation. Moreover, by starting with a single component the computational burden 
is low at the beginning, increasing only whether more components are required. Finally, 
we presented the effectivity of our technique in a series of simulated experiments with 
synthetic data, artificial, and real images, and we compared the results against Q. 
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Abstract. The aim of the work discussed in this paper is to introduce a method 
for verifying temporal requirements of time-constrained systems. The method 
predates by establishing a generic repository of observation patterns relative to 
a new time constraint taxonomy that we define. The method allows the auto- 
mated verification of temporal requirements, initially expressed in a semi-formal 
formalism -UML State Machines (SM) with time annotations- through model 
transformation and model-checking technique. In practice, in order to check the 
temporal aspects of a given specification, the observation patterns relative to the 
extracted requirements are instantiated to obtain appropriate observers. Then, us- 
ing a transformation algorithm, the system specification as well as the obtained 
observers are translated into Timed Automata (TA) models. Thereby, the veri- 
fication is reduced to a reachability analysis within the global model bringing 
together the system under study and the obtained observers. 

Keywords: Time-constrained System, Verification and validation, UML state 
machines, Timed automata, Model transformation. 



1 Introduction 

Given their practical implication on the safety and correctness of critical applications 
(e.g. transportation systems, nuclear plants), specification and verification are among 
the most important research topics in critical systems engineering, since such systems 
must achieve a high level of robustness and reliability. In addition, these systems usu- 
ally involve time-dependent functionality. Consequently, methods for behavior model- 
ing and verifying (especially temporal requirements) are increasingly important. The 
approaches most used for specifying timed systems are based on Timed Automata 
(TA). TA are well suited for expressing timed behavior and for modeling real-time 
components. A number of automatic verification tools for TA have been developed and 
have proven to be efficient e.g., Uppaal |[T3l and Kronos ll24l . Nevertheless, specify- 
ing and verifying time constraints are becoming more and more difficult tasks due to 
the widespread applications and increasing complexity of checked systems. Despite the 
different advantages proposed by TA, such as parallel composition, users often need to 
manually manipulate a set of clock variables with complex calculated clock constraints 
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in order to express the time properties. This process is tedious, error-prone and requires 
sophisticated logical and/or mathematical skills. 

On the other hand, in order to cope with the complexity of critical systems engineer- 
ing, approaches based on Model Driven Engineering (MDE) seem to be very useful 
lfT8l . The aim of this work is to introduce a new temporal requirement verification 
method based on MDE. First, we define a Pattern Basis for monitoring time constraints. 
Indeed, based on a new time constraint classification, we developed a set of time ob- 
servation patterns expressed in Unified Modeling Language (UML) State Machines 
(SM)[21 ]: this is expected to be a relatively inexpensive activity, since this procedure 
is done once and for all. UML has been chosen since it is relatively intuitive, offers 
a graphic description, is implemented by several tools and, finally, is a standard no- 
tation well supported by the Object Management Group (OMG). This set of patterns 
facilitates high-level system design. These patterns cover a large class of common time 
constraints. 

Since our aim is to keep a high precision level, a subsequent step consists in giving 
an accurate definition of each developed pattern. Hence, for each pattern, we give (1) 
a textual definition, (2) a UML SM model, (3) a structured English specification and, 
finally, (4) a temporal logic expression (Timed Computational Tree Logic (TCTL)) 
relative to the property concerned. 

Concretely, the verification process is based on the set of patterns. The suitable pat- 
terns corresponding to the time constraints extracted from the system requirements are 
picked up and instantiated. This instantiation step generates a set of SM Observers. The 
SM observers are translated into more formal notation, the TA, which provides support 
for the properties' verification. The translation is made according to a transformation 
algorithm that will be discussed later in the paper. In this way, analyzers exploit the ben- 
efits of formal notations without having to go through the complex and expensive for- 
mal modeling phase. This transformation generates a set of TA Observers. A system's 
model, which is also generated by the same transformation algorithm, is synchronized 
with the obtained TA observers to obtain a global model. Hence, the verification task is 
performed on this obtained model with a reachability analysis while checking whether 
the observers' forbidden states - corresponding to constraints violation - are reachable. 

The paper is organized as follows. In Section 2, we set out the context and we briefly 
go through some related works. Section 3 describes our first contribution by introducing 
the new time-constraints taxonomy and the patterns basis. The second contribution of 
our method is outlined in Section 4 where the translation from UML SM, with time 
annotations, to TA is described. The method is illustrated using a Level Crossing (LC) 
case study in Section 5. Section 6 concludes the paper while outlying some future work. 

2 Context and Background 

2.1 Related Work 

There have been many recent research efforts in the field of time-constrained system 
validation. Only two of these will be discussed in this section. First, based on the 
Dwyer [5] pattern basis, Dhaussy |3] defines a textual language, called "CDL", for 
requirement specification. The requirements are then translated into observer automata. 
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Furthermore, Dhaussy defines for each requirement a path, called "context" where the 
requirement should be checked. Finally, the system model, the observer automata and 
the context are translated into IF notation (Intermediate Format). Then, the verifica- 
tion is carried out using the IFx tool. Second, Nascimento 1 17] presents an approach 
for automatic generation of network of timed automata from a functional specification 
depicted via UML class and sequence diagrams. Nascimento uses UML sequence di- 
agrams for the property specification phase. However, sequence diagrams suffer from 
a limited expressiveness when dealing with temporal aspects, since they only depict 
order. 

Unlike the above methods, our approach uses TA as target notation; TA are assumed 
to be more expressive and well supported. 

On the other hand, several projects have introduced natural-language-based 
approaches where natural language is mapped into a more formal specification. Dwyer 
|5 ] proposes several patterns applicable to property specification expressed in differ- 
ent formalisms and logics such as LTL, CTL, GIL, and quantified regular expressions 
(QRE). Konrad [ 1 1 ] proposes an extension to Dwyer's classification and real-time prop- 
erties are added to the original classification. Moreover, TCTL, MTL and RTGIL are 
used to specify the added real-time properties. 

Compared to the above-mentioned works, our contribution offers the following 
advantages: 

- Our method takes advantage of the flexibility and expressiveness of UML SM in 
modeling tasks and the precision of TA formalism in the verification tasks. UML 
SM are also more expressive than UML sequence diagrams or UML collaboration 
diagrams used in other works, 

- TA are well supported, 

- Patterns facilitate high-level specification and promote reusability and knowledge 
capitalization, 

- The verification task is reduced to a reachability analysis, this allows us to over- 
come some limitations met with some existing tools, such as Uppaal. 

2.2 Observer Technique 

We use observers whenever we set artifacts to monitor system behavior (H. Let us 
recall here that the goal of our approach is to check whether the temporal requirements 
expected from a given system are satisfied. Hence, we make use of observers in order 
to express the satisfaction vs the violation of the predefined requirements [9 ]. Typically, 
checking a given temporal property consists in examining whether the error state of the 
corresponding observer is reachable. 

3 Observation Patterns 

In this section, we first propose a classification of all the common temporal require- 
ments one may meet when dealing with critical systems. Then, we develop a structured 
English grammar that we use to express the predefined properties. Next, we introduce 
the patterns used in order to monitor the predefined temporal requirements. Finally, a 
standardized description of these patterns is suggested. 
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Table 1. Temporal Requirement's Taxonomy Descriptions 



Class 


Category 


Pattern Name 


Description 


> 
I 



O 


QuantitativeAbsence 


Forbidden Before 


R ensures that an event (Emon) must never occur be- 
fore a minimum T oe f ore (time unit over En e f). S|=R 
is true if this event does not occur before T oe f ore . 


Forbidden After 


R ensures that an event (E mon ) must never occur after a 
deadline T a f ter (time unit over Er £ f). S|=R is true if 
this event does not occur after T a f ter . 


Forbidden Between 


R ensures that an event (Emon) must never occur within 
a temporal interval }t Begin', tEndi (over E^ e y). S|=R 
is true if this event does not occur between temporal in- 
terval }t Begin', t E ndl 


QuantitativePresence 


MinimumDelay 


R ensures that an event (Emon) must occur after a mini- 
mum time Train (time unit over Er £ f). S|=R is true if 
this event occurs after T m i n . 


MaximumDelay 


R ensures that an event (E mon ) must occur before a 
deadline T max (time unit over E^ e y). S|=R is true if 
this event occurs before Tmax. 


Punctuality 


R ensures that an event (E mon ) must occur at one punc- 
tual date t (time unit over E^ e j). S|=R is true if this 
event occurs at the t date. 


QuantitativeRecurrence 


UnlimitedRecurrence 


R ensures that an event (Emon) must occur infinitely. 
S|=R is true if this event occurs. 




QualitativeRecurrence 


LimitedRecurrence 


R ensures that an event (Emon) must occur k times. 
S|=R is true if this event occurs k times. 


QualitativePresence 


Presence After 


R ensures that an event (Emon) must occur after E^ e y 
has been detected. S|=R is true if this event occurs at 
least once after Er £ f . 


PresenceBefore 


R ensures that an event (Emon) must occur before 
ER e f. S|=R is false if E^ e j occurs before Emon- 


QualitativeAbsence 


Absence After 


R ensures that an event must never occur after En e f. 
S |=R is true if this event does not occur. 


AbsenceBefore 


R ensures that an event (Emon) must never occur be- 
fore Er g fV. S|=R is true if E m on does not occur before 
E Ref . 



3.1 Main Time- Constraints 

We strive to identify all the common temporal requirements one may meet when dealing 
with critical systems. The main identified requirements are defined and explained in 
TableQ](The relation which denotes that a system S satisfies a requirement R is written 
S|=R) and are also depicted in the shape of a UML Class diagram (Figure [T]). This 
classification offers the advantage that it deals with requirements on events only, since 
we express the requirements on states using two events: the first event represents the 
activation of the state and the second the deactivation. 



3.2 Structured English 

To facilitate the expression and the formalization of temporal properties, we have devel- 
oped a structured English grammar (SEG) |8|. This grammar supports both qualitative 
and quantitative properties. The aim of developing such a grammar is to provide the 
user with simple means for expressing temporal requirements. The SEG is somehow a 
guiding framework which offers expression means to the user. It also constrains him to 
make assertions according to a predefined syntax, hence making it possible to automati- 
cally recognize the expressed requirements. Concretely, each sentence generated by our 
grammar describes a temporal property and serves as a handler that aids expressing and 
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Fig. 1. Temporal Requirements Classification. 

understanding the requirement. For more details about the proposed SEG, the reader 
can refer to [8]. SEG is expressed below using BNF (Backus-Naur Form) notation: 

Property = {<Scope>: <Specification>}; 

Specification = {<Entity><Obligation>occur <Reference>} ; 

Scope = Global | Before <Entity>\ After <Entity>\ Between <Entity> and <Entity>; 

Entity = "Event" |"Active( State)" |"Desactive(State)"; 

Obligation = must | cannot; 

Reference = ((exactly at < time > over) | (After [a delay of < time > over]) |(<Before [a delay of 

< time > over] )) < Entity > ; 
Time = <Number>tu; 
Number = <Digit>+ ; 
Digit = {0|1|2|3|4|5|6|7|8|9}; 

Literal terminals are given in bold font, non-literal terminals are delimited by quotation 
marks (" ") and non-terminals are given in italics. The start symbol of the grammar is 
property and the language C of the grammar is finite, since the grammar is non-circular 
and has no repetitions. 



3.3 Observation Patterns Basis 

A pattern is a commonly reusable model in software systems that guarantees a set of 
characteristics and functionalities. The identification of a pattern is based on the context 
in which it is used. The goal behind developing patterns is to offer a support for system 
design and development. Using patterns helps in keeping design standardized and useful 
and minimizes reinventing in the design process, since the patterns facilitate reusability 
and knowledge capitalization [ 6 ] . 

In this work, we define a set of patterns which will serve as a basis to generate ob- 
servers for all the identified temporal requirements. The notation used is UML State 
Machines. The basis of patterns is introduced regardless of the systems' specification 
and is used to model all the common temporal requirement types that one may express. 
This pattern basis guarantees the reusability and the genericity of the mechanisms de- 
veloped within our approach. 
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3.4 Pattern Formalization 

We have introduced a new temporal requirement classification that is used in imple- 
menting our pattern repository. Additionally, we include a graphical representation of 
each pattern in the shape of a UML SM diagram [16]. This field will be used later as an 
input model for the model transformation phase. Each pattern in the repository contains 
the following fields: 

Pattern Name. The pattern name serves as a handle for the pattern's use and describes 

the type of the pattern. 
Pattern Definition. A short description and definition of the requirement for which the 

pattern is used. 
Scoped Structured English Specification. The scoped structured English sentence 

captures the invoked property using the grammar previously defined. The scope, 

initially introduced by Dwyer in 0, is used to express the applicability interval 

(scope) of the property. Four scopes are used in our grammar: globally, before an 

event occurs, after an event occurs and between two events. 
Temporal Logic Description. Contains mappings of the property monitored by the 

pattern to TCTL for each of the four defined scopes. We chose TCTL since it allows 

quantitative temporal property expression. 

4 Transformation Approach 

Since the observation pattern basis has been introduced, we will now discuss how to 
use this basis in the verification process. In practice, once the temporal requirements for 
the system under study are identified and extracted, the appropriate patterns for these 
requirements are selected and instantiated with the suitable parameters, thus resulting in 
some SM observers. Each SM observer monitors an elementary requirement. The SM 
observers are then translated into TA observers. This transformation will be presented 
hereafter. 

4.1 Transformation Idea 

In spite of the number of automated analyzers developed for TA, these tools suffer from 
two main limitations: the first is that users must be familiar with their formal nota- 
tions. The second is the lack of patterns for high-level system design (hierarchy notion 
namely). On the other hand, semi-formal languages, such as UML SM, are suitable for 
expressing system requirements. However, the automatic verification of these models 
is unfeasible directly. The temporal requirement verification approach that we propose 
takes advantage of the expression flexibility of SM and the analysis facilities offered by 
TA formalism. 

The various rules of the transformation algorithm we have defined are expressed 
according to the Model-Driven Architecture (MDA) approach. MDA is an initiative and 
a standard proposed by the OMG, allowing developers to create systems entirely based 
on models. It points out the idea of separation of concerns by unlinking/uncoupling the 
application logic from the implementation platforms technology [23] . 
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Fig. 2. Transformation Approach. 



Figure[2]illustrates the use of the MDA four-layer metamodeling architecture for our 
transformation; 

- The source model (resp. target model) is expressed according to the source meta- 
model (resp. target metamodel), 

- The metamodels are defined and expressed according to the MOF metametamodel 
(in our transformation, we used the Ecore metametamodel, which is the Eclipse 
implementation of MOF), 

- A metamodel is developed for TA. On the other side, we used the UML metamodel 
distributed in the Eclipse framework, 

- All the rules are introduced at the metamodel level, 

- The transformation takes a UML SM model as a source model and generates a TA 
model with a corresponding formatted code. 

4.2 Time Annotations 

Here we use SM as a modeling notation to take advantage of the flexibility they offer. 
However, since we strive to obtain an accurate specification, we should guide the user 
while introducing the temporal constraints. Concretely, we propose a set of timed an- 
notations in order to express the states' characteristics as well as the transition guards. 
Table [2] shows some examples of them and defines the signification of each annotation. 



4.3 Transformation Algorithm 

One of the key parts of our method is the translation of UML SM with time annotations 
into TA. For the sake of space, we will briefly describe the transformation rules while 
giving the source and target element for each of them in[3] For more details, the reader 
can refer to fl5l . 

The main rule of this algorithm is the FromStateMachine rule. This rule is the first 
one carried out by the transformation algorithm. It picks elements in the source model, 
then calls on other rules to translate the selected elements into TA elements in the target 
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Table 2. Time annotations used. 





Time annotation 


Signification 


1 


at_most(T max ) 


t_^ J- max 


2 


at_least(T min ) 


t^ -I- min 


3 


after(d) 


t=d 


4 


between(T min , T m ax) 


-t min _;t\ _/ max 


5 


upper(T min ) 


t> -t min 


6 


lower(T ma x) 


t\ -t max 



Table 3. Transformation Rules. 



Rule Name 


Source element: UML SM 


Target element: TA 


FromStateMachine 


StateMachine 


TA 


Simple2Simple 


State 


State 


Final2State 


Final Pseudostate 


State 


OR2Automata 


State 


Automaton 


AND2Automata 


State 


Automaton 


Trans2Trans 


Transition 


Transition 


Entry2State 


EntryAction 


State 


Exit2State 


ExitAction 


State 


Do2State 


DoActivity 


State 



model. Likewise, the rules called on behave in the same way; they select elements in 
the source model and call on the appropriate rule to transform them. For example, the 
FromStateMachine rule is applied to elements of the "UML:: StateMachine" type and 
translates them into a "TA::AutomataMachine" element. Also, different element types 
are selected and different rules are called on in this rule. First, the rule selects all the 
UML states. Then for each selected state, according to its type, one or more of the 
Simple2 Simple or OR2Automata or AND2Automata rules are called on. Secondly, it 
translates the "UML: transition" elements by invoking the Trans2Trans rule. Also, this 
rule deals with another element type, the "UML:: Pseudostate", by invoking some other 
rules, such as Final2 State. 

This internal transformation process is the same for all the rules; each rule transforms 
the source element into the target one. Then, it selects subelements of the source element 
and calls on the appropriate rule to transform them. 

4.4 Verification Process 



Once our observation patterns' repository is implemented, we introduce a verification 
process based on it. This section will outline the global architecture of our approach. 
The architecture is depicted graphically in Figure[3l lfT4l . 

Concretely, our approach is composed of four processes: first, temporal requirements 
for the system under study are identified and extracted. Second, the appropriate pat- 
terns for these extracted requirements are selected and instantiated with the suitable 
parameters (This second process results in some SM observers. Each SM observer 
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Fig. 3. Method Global View. 

corresponds to an elementary requirement). Third, the SM observers are translated into 
TA observers. In parallel and in the same way, the specification under study (SM model 
with time annotations) is abstracted and translated into a TA model. The translation 
from the UML SM to TA is performed using the MDA model transformation tech- 
nique, as shown previously. Finally, the generated TA are synchronized with the formal 
system's specification model (TA) to generate a global model holding both the system 
specification and the requiements' monitoring. Thus, the verification task is reduced to 
an error- state reachability search on the obtained global model. 



5 Case Study 

5.1 Case Study Description 

A classical automatic level crossing system is composed of several modules. The local 
control system which manages the traffic in the crossing area, a pair of barriers (gate), 
traffic lights whose role is to alert and prevent road users from entering the crossing 
zone, and a train-sensing module which monitors trains approaching/leaving Q. The 
subsystems mentioned above execute in parallel and synchronize through events. 

Several requirements |25 ] are given in textual specification in Figured Next, using 
this textual specification, we will show how patterns are used to express and monitor 
requirements. 
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". . . As long as an approaching train runs over the activation sensor the sensor shall 
generate an occupied signal. When the last axle of the train has traversed the 
activation sensor it shall generate a free signal again. If the control unit receives an 
occupied signal from the activation sensor in the unsaved mode, it will enter the saving 
mode and gives the command to turn on the yellow lamp of the set of lights. Three 
seconds after entering the saving mode the control unit has to give the command for 
switching off the yellow lamp and turning on the red lamp. . . " 
". . . By entering saved mode the controller shall switch on the supervision signal to 
show signal aspect LCI, which means to turn on the blinking light. Twelve seconds 
after the system has entered the safed mode it must start to lower the gate. The 
activity of lowering or raising the gate must not last longer than six seconds from 
one end position to the other. When the gate has reached the lower end position within 
the six seconds interval the control unit will enter the mode saved and gates closed. . . " 



Fig. 4. Specification Example [25]. 



Punctuality Pattern 

Pattern Name: Punctuality 

Textual Description: The Punctuality pattern is used to check that a given event execution 

is delayed by exactly d time units relative to a given reference. 
UML SM Diagram: 



►T Initial 1 — entering 



Punctuality 
SM Observer 



upper(3) 



^P\ 



switch ? 
" after(3) 



\ 






KO 



switch ? 
~lowet(3) 



OK 



Structured English: Scope "switch" must occur exactly at "3" tu over "entering". 

Temporal Logic description: 

Globally: AG[receive(entering) =>■ AF =c z receive(switch)] 

After E a : AG(E a ^(EG[receive(entering) =>■ AF =c z receive(s witch)])) 

Before E&: A((AG[receive(entering) =>■ AF =£ j receive(s witch)]) U E&) 

Between E b and E e : 

AG((E 6 A--E e )-KP) U E e ) 

where P=AG[receive(entering) => AF = d receive(switch)] 



Fig. 5. Punctuality Pattern. 
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5.2 Using Patterns 



For the sake of space, only two requirements will be checked on the basis of the tex- 
tual system specification. For each requirement, formalization is introduced using the 
defined generic template. First, a textual description of the requirement is presented, 
followed by an intuitive graphical representation in the shape of a UML SM (SM pat- 
terns). Then, a definition using our structured English grammar is given and, finally, a 
temporal logic expression is used to express the requirement formally. 

1) The first requirement states that, three seconds after receiving the entering signal, 
the controller should send a command for switching lights. This requirement consists 
in a punctuality property (Figure [5} of 3 seconds between the entering signal and the 
switch signal. 

2) The second requirement states that the command signal closed should be detected 
at most 19 seconds after the entering signal. This requirement consists in maximumde- 
lay property (Figure [6]) of 19 time units between entering and closed signals. 



MaximumDelay Pattern 

Pattern Name: MaximumDelay 

Textual Description: The MaximumDelay pattern is used to check if event execution is 

delayed by at most a given value £ time units from a given reference. This value £ 

should be in [0, T ma x] interval. 
UML SM Diagram: 



#— » ( Initial ] — entering ? 

, Ohr 



MaximumDelay 
SM Observer 



upper(19) 



closed ? 
at_most(19) 



*\. 



KO 






OK 



Structured English: Scope "closed" must occur Before a delay of "19" tu over 

"entering". 
Temporal Logic description: 

Globally: AG[receive(entering) => ^closed U<iq receive(closed)] 
After E a : AG(E a ^(AG[receive(entering) => (^closed U<ig receive(closed))])) 
Before E^: A((AG[receive(entering) => (^closed U<iq receive(closed))]) U E&) 
Between E^ and E e : 

AG((E 6 A--E e )-KP) U E e ) 

where P=AG[receive(entering) => (^closed U<iq receive(closed))] 



Fig. 6. MaximumDelay Pattern. 
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5.3 Verification 

The above step consists in instantiating the appropriate patterns in order to obtain ob- 
servers for the extracted requirements. These observers are then translated into TA mod- 
els to be synchronized with the system specification. The verification process consists 
in examining the reachability of the KO-statcs within the observers. The verification of 
our case study is carried out using the UPPAAL model checker. 

6 Conclusions 

In this paper, we have presented a model-based method applicable to formal specifica- 
tion and validation of time-constrained systems. The approach uses a set of observation 
patterns that we have established and which act as watch-dogs for the defined temporal 
requirements. Each pattern has been defined using a standard template we developed. 
Using patterns offers genericity and reusability. 

On the other hand, we have developed a transformation algorithm to translate SM 
with time annotations into TA, the aim being to make a basis for the verification pro- 
cess. For this purpose, we have introduced a TA metamodel using an extended defini- 
tion of the original TA definition given by 1 1 1 (for reasons of brevity, the TA metamodel 
description is omitted in this paper). Once the relationships (transformations rules) be- 
tween UML SM metamodel elements and TA metamodel elements were defined, we 
expressed them in the QVT (Query/ View/ Transformation) language, defined by the 
OMG as the standard for the transformation phase. Then we used QVTo -an Eclipse 
Plugin- to run the algorithm. 

Processing the verification upon the TA observers synchronized with the system 
specification reduces the verification task to a reachability analysis of the KO-nodes 
within the observers. 

One of the main issues that we are focusing on currently is the validation of the 
model transformation algorithm that we have developed since it is a key in ensuring the 
correctness of our approach. There are several techniques that aim at validating model 
transformation (2), ifTOlL (El, (T9lL (20lL l22l . In our case, we elaborate the underlying 
transition system for both UML SM with time annotations and TA, and we establish 
equivalence relation (temporal bisimulation) between the obtained transition systems 
(TO), (T9l . Despite that this technique is the most time-consumer over other techniques, 
it remains that it is the best proof to guarantee semantic preserving (fill the semantic 
gap) through the transformation since it is mathematical-based. 

Based on the structured English grammar we have developed, a prototype tool which 
offers interesting facilities in terms of requirement specification and requirement 
consistency-check has been implemented [8]. A subsequent step will be to extend this 
tool with a new module which automatically instantiates observers for the entered re- 
quirements using the observation patterns repository. 
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Abstract. Surveillance and convoy tracking applications often require groups 
of networked agents for redundancy and better coverage. Important goals upon 
deployment include the establishment of a formation around a target and syn- 
chronization of the output (e.g., velocity). Although there exist distributed algo- 
rithms using only local communication that achieve these goals, they typically 
ignore destabilizing effects resulting from implementation uncertainties, such as 
network delays and data loss. This paper resolves these issues by introducing a 
discrete-time distributed design framework that uses a compositional, passivity- 
based approach to ensure /™ -stability regardless of overlay network topology, in 
the presence of network delays and data loss. For the restricted case of a reg- 
ular overlay network topology, this work shows that asymptotic formation es- 
tablishment and output synchronization can be achieved. Finally, simulations of 
velocity-limited quadrotor unmanned air vehicles (UAVs) are presented to show 
the performance in the presence of time-varying network delays and varying 
amounts of data loss. 

Keywords: Passivity, Compositionality, Deployment, Input-output stability, 
Overlay network, Output synchronization. 



1 Introduction 

Modern surveillance and convoy tracking applications often require deploying groups 
of unmanned aerial vehicles (UAVs). The benefit of using multiple UAVs is redun- 
dancy, which reduces the likelihood of missing interesting events on the ground, in the 
presence of obstructions caused by nonuniform terrain, vegetation, or man-made struc- 
tures. Further, the additional UAVs provide greater breadth of coverage. A central task 
for such multi-agent systems is to establish a formation around an area of interest. For 
example, an n-gon with a target as its center, at the appropriate radius, may simulta- 
neously provide significant redundancy and breadth of coverage. Another problem of 
interest when controlling autonomous UAVs is to synchronize a subset of the outputs. 
For example, consider flying the UAVs in a desired formation with identical speed and 
heading. This problem is known as the output synchronization problem. 

Performing coordinated tasks in multi-agent systems using only local information 
has been studied extensively over the past decade lfT6l , ifTTlL iTTSll . Typically, in group 
coordination the desired formation emerges from the design of the control law. In [4|, 
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the so-called information filter is used for formation stability of LTI systems. For co- 
ordination of nonlinear systems, contraction theory with wave variable communication 
[20], explicit design of Lyapunov vector fields |[T2lL and passivity Q, (HJ, Q, and (7), 
have been used successfully. 

Much of the above work — especially the passivity-based methods — has considered 
continuous-time systems; however, for implementation discrete-time design is needed. 
In addition, implementation uncertainties such as network delays and data loss must 
be taken into consideration. This paper focuses on decoupling the control design and 
discrete-time implementation by using a passivity-based framework inspired by work 
in telemanipulation [3], port-Hamiltonian systems 1 19], and network control 1 11]. 

In this work, we introduce a compositional network control system (NCS) design ap- 
proach that guarantees passivity of the networked system. We show that a multi-agent 
network designed within the constraints of this framework is /™ -stable for any bidi- 
rectional overlay network with asymmetric delays whenever the input-output mapping 
of each agent is strictly-output passive. The stability result holds for packet- switched 
networks using easily enforced constraints. For the single-input, single-output (SISO) 
case, we perform steady-state analysis to show that the multi-agent network can achieve 
output synchronization or establish an n-gon upon deployment under appropriate as- 
sumptions. Finally, we provide simulations of velocity-limited quadrotor UAVs illus- 
trating how the multi-agent network performs in establishing an n-gon in the presence 
of time- varying network delays and varying amounts of data loss. 

The rest of the paper is organized as follows: Sect. [2| defines the formation estab- 
lishment and output synchronization problems and provides background material. The 
distributed NCS design framework is described in Sect. [3] The theoretical results are 
detailed in Sect. (4) Section \5\ presents simulations in Simulink/TrueTime. Concluding 
remarks are given in Sect. [6] 

2 Preliminaries 

In this paper we consider two problems: formation establishment and output synchro- 
nization. First, consider the problem of n agents establishing a formation around a 
target in IR 2 . Assume a global inertial coordinate system and suppose the starting 
positions of the agents are arbitrary. The goal is to establish an n-gon, where the n 
agents tend to the coordinates of the vertices asymptotically. Formally, we assign a ver- 
tex V{ of the n-gon to agent z, with position y%(k\ i = 1, 2, . . . , n. Then we require 
lim/e^oo \\yi(k) - ^|| 2 = 0. 

For the problem of output synchronization, the outputs yi and yj of any two agents i 
and j (i ^ j) must satisfy lim/^oo \\yi{k) - yj(k)\\ 2 = 0. 

For these problems, we consider a network of n interacting agents with commu- 
nication topology described by a connected undirected graph, G = (V,E), where 
V r = {l,2,...,n} describes the agents and E cV xV models the bidirectional com- 
munication. Additionally, each bidirectional link may have asymmetric, time- varying 
delays. The delays are denoted dij (k) for link (i, j) 6 E. 

For the purpose of analysis, it is useful to introduce the adjacency matrix, A = [a^], 
associated with graph G 0. For an undirected graph, the adjacency matrix is a sym- 
metric matrix (i.e., A = A T ). An entry a^ is equal to one if the edge (i, j) € E and 



A Passivity-Based Approach to Group Coordination in Multi-agent Networks 137 

zero otherwise. Additionally, we define the set of neighbors, Ni, of a node i as those 
nodes which send messages to z, given by Ni = { j G V \ a^ ^ 0}. Finally, we denote 
the number of neighbors by | Ni\ = m. 

The agents communicate and process signals in the extended I2 -space of functions 
that maps Z + to R m , denoted l™ e . These signals are mapped onto I™ by the TV-truncation 
operator (-)n '• l™ e — > I™, which simply nullifies the function values for indices strictly 
larger than N — 1. Further, for all /, g G l™ e define (/, #)at — ^k=o / T (^)#(^) anc ^ 
II (/)jv II 2 — (fif)N- We use definitions for /^-stability and passivity for discrete-time 
systems, which are analogous to the continuous-time counterparts in (T8J: 

Definition 1. Given a discrete-time system with input-output mapping, H : l™ e — > l^ e , 
the discrete-time system is /^-stable {fuel™ =>• H(u) G I™. 

Definition 2. LetH: lf e -> lf e . Then, for all ue lf e : 

1. H is passive if there exists some constant /3 G M (the bias) such that (H(u), u)n > 

-AV7VGIN; 

2. i^ is strictly output passive if there exists some constants (3 G R a/iJ e > ^wc/z 
rAar (ff(w), ^)at > e||(ff(w))j>r||l - A V7V G IN. 

We assume a synchronous network, with period T\}j Further, each agent shares infor- 
mation only locally (no global shared resources). However, the desired setpoints are 
calculated prior to deployment. Finally, the agents begin execution at time index k = 0. 

3 NCS Design 

This section details the distributed network control system (NCS) design. The objective 
is to provide a passive-by-construction, discrete-time multi-agent network. In general, 
the overlay network is bidirectional with asymmetric delays. For simplicity, consider 
the three node network shown in Fig.[T^. Each node represents a quadrotor UAV, with 
each edge modeling the communication between UAVs. Realistically, each link in the 
network is subject to delay imposed by packet handling and transmission delays. This 
is modeled by the time- varying delays (e.g., d^-(fc)), shown in Fig. [T^. The u and v 
variables in the figure are power wave variables, which are described in Sect. 13.21 



3.1 Agent Model 

The agent model is shown in Fig.QJ). Each agent i receives an input reference, r^, which 
influences the output, y^ of the agent through the system mapping, B.{. The mapping 
Hi describes a compensated plant, and is required to be strictly output passive. The 
variables X{ and yi are transformed into the wave domain through the scattering trans- 
formation. The node's wave variables uu and vu are coupled to other nodes through a 
power junction, PJi, which allows two or more systems to be connected in a passivity- 
preserving manner [11]. The scattering transformation and power junction are crucial 
to ensuring passivity of the networked system and will be described in the next section. 



1 We assume the agents use a clock synchronization algorithm prior to deployment to ensure 
this assumption holds. 
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(a) A three node network. 



(b) Node structure. 



Fig. 1. Network and node structure. 

For simplicity, we model the quadrotor UAVs as a point mass in two dimensions. We 
denote the point mass system, H p : fj — ► yi, in which // G IR 2 is the control force 
and yi G IR 2 is the position as depicted in Fig. [2^. The equations of motion are 

y I (t) = v I (t), Mv I (t) = f I (t). 

Using the point mass model for each agent i, we design a position control system, which 
we denote Hi : e\ — ► yi, shown in Fig. [2^. The inner loop gain of the compensator is 
uj c M (ou c > 0) and the outer loop gain ^f . The overall equation of motion 



Vi 



-v c yi 



-(yi - ei) = -2C,uj n yi - uj 2 n (y! - e { ) , 



clearly indicates a stable second order system with natural frequency u n = ■% and 
damping coefficient ( = 4=, where yj = e\ at steady-state. It can be shown that the 
position control system is inside the sector [a, 1], where a — — ]_ r- |2T1L |[T0l . 
Therefore, the system Hi : e^ —> yi is not strictly output passive; however, by adding 
a high-pass filter in parallel, the system may be rendered strictly output passive, as 
depicted in Fig. [2J3 (with c = 2). Since e^ = yi = yi at steady-state, the position 
of the system may be directly controlled. This model is discretized using a bilinear- 
like transform, called the inner-product equivalent sample and hold (IPESH) transform, 
which preserves the passive properties of the system fiTTl . 



3.2 Network Model 

In distributed control applications the information transmitted across the network has 
inherent physical meaning. It is well known that transforming these physical variables 
into the wave domain can preserve passivity and stability for a single bidirectional con- 
nection [3] and for star networks ATI . In this paper, we extend these discrete-time 
approaches to distributed networks with arbitrary overlay topology. The network model 
is distributed in the sense that all nodes in the network communicate only locally. 
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(b) Strictly output passive UAV model. 
Fig. 2. Simplified quadrotor UAV model. 



We formally define the scattering transformation as follows. For each i E V, the 
scattering transformation produces power waves uu(k) and vu(k) defined by 






(la) 
(lb) 



This definition is similar to the one in lfT4l . with the force and velocity variables re- 
placed with Xi and yi. Traditionally, the scattering formalism has been applied to power 
variables (effort and flow) while closing the loop on velocity. In this work, the scattering 
formalism is used abstractly (without the physical interpretation) to close the loop on 
position. The scattering transformation is treated as a mathematical definition, where 
the characteristic impedance, bi, has the appropriate units for physical consistency (in 
this case bi is unitless). 

Next, we define the power junction, which allows two or more systems to be con- 
nected in the wave domain in a passivity-preserving manner. 

Definition 3. Fix m,p G IN,p > 2. Then, a power junction is a function f: l™ e p — > 
l™ e p , which satisfies for all £ G l™ e p and all k G 7Z^ the inequality £ T (fc)£(fc) > 

fm)) J fm))- 



The vector £(fc) in the definition of the power junction is formed by concatenating the 
p inputs in l™ e into a single mp-dimensional column vector. For analyzing our network 
model, it is useful to pair the p inputs to their corresponding outputs in the output 
column vector, f(£(k)), and partition the set of pairs into two disjoint sets Si n and S out . 
These sets denote the net flow of power into and out of the power junction, respectively. 
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Formally, for i G Si n and o G S ou u let u^ v G l™ e denote the inputs and Vi, u G l™ e 
denote the outputs of the power junction. Then the inequality in Def.[3]may be rewritten 
as 

53 uJ(k) Ui (k) - vJ(k) Vi (k) > ]T <( k ) u o{k) ~ vj(k)v (k) . (2) 

i^Sin oeSout 

We implement each node's power junction as a linear set of equations. Specifically, we 
use the following equations. For each i eV, j £ Ni, and k G ^ + , the outgoing waves 
are computed as 

Uij(k) = -±=u u (k) , (3a) 

jem 

Although the functional form of the power junction is not constrained to be linear, these 
equations simplify the steady-state analysis and exhibit a local averaging behavior in 
regular networks. This can be seen as follows. Consider the wave variables that influ- 
ence the power junction at a given node i and suppose rii = rij = rj, Vi, j G V (i.e., a 
regular network). Then, for each j G N{ , Vji (k) = Uji(k — dji(k)). Thus, an expression 
for vu(k) is given by 

vu(k) = -±= J2 v Ji( k ) = ^Yl u ^ k " d i*( fc )) = Tj J2 u ^ k " d J*( k )) ' 

jeNi j£Ni j£Ni 

Therefore, in regular networks, the input wave variable, vu(k), is the average of its 
neighbors' delayed output wave variables, Ujj(k — dji(k)), j £ N{. 

Due to the presence of delays and data loss, some (or all) of the Vji(k) may not be 
received at time k, in which case Vji(k) = 0. Handling delayed and dropped packets 
as null packets satisfies the synchronous assumption and preserves passivity |3 1. Before 
proceeding to describe the constraints on delayed and lost data, we prove our claim that 
the implementation given by ^ satisfies the definition of a power junction. 

Lemma 1. The implementation defined by @ satisfies the power junction constraint. 
Proof. From the remarks following the power junction definition, we must show 

ul(k)uii{k) - vl(k)vu(k) > ^ u Jj( k >ij( k ) ~ vjiityvjiik) . (4) 

jem 

Clearly, a sufficient condition for satisfying ([4]) is to enforce the following constraints 
for each component I = 1, 2, . . . , m, 

J2 <(k) < <(*) andt&(fc) < J2 v l,\ k ) ■ 

jem jem 



The first inequality holds with equality. Indeed, by d3aL 

jem jGNi 
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Fig. 3. A point-to-point connection using the scattering formalism to ensure passivity of the bidi- 
rectional connection subject to asymmetric time- varying delays, shown inside the dashed box. 

For the second inequality, combine d3bt with the Cauchy-Schwarz inequality to get 



Finally, we constrain the network model by preventing retransmission of data for each 
agent. Also, as mentioned above, whenever receiver's buffers are empty, we process null 
packets. Based on these assumptions, each channel (i,j) G E satisfies the following 
inequality regardless of time- varying delays and data loss 0, 



(^ 3 )n\\1<\\(u 13 ) n \\1 holds V7V GIN. 



(5) 



This inequality states that each channel, viewed as the input-output mapping shown in 
Fig. is passive. 

4 Analysis 

4.1 Passivity of the Networked System 

In this section we first prove that the network model is passive and then show that 
the input-output mapping describing the networked system is strictly output passive. 
Figured shows the passive network. The following lemma proves that the portion inside 
the dashed box of Fig.|4]is passive. 

Lemma 2. Consider a network of n interacting dynamic systems constrained to the 
design framework described in Sect.\3\ Then, the global energy constraint 



i=l 



(uu)n\\1 



(vn) N \\l) >0 



(6) 



is satisfied for all N E IN, regardless of time -varying delays and data loss. 



Proof Sum the power constraints ([4]) of each node i from time k = 0tok = N — 1 
and then sum the resulting inequalities over all nodes (swap the position of Vji and Vij 
in the sum of sums). Then, invoke © to obtain 



£(IK U -Vll2 - ll(^)ivlll) > E E (IIKMI 



ll(%-)iv||i)>o. 



i=l 



i=i jeNi 



□ 
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Fig. 4. A three node network illustrating the passive network, shown inside the dashed box. 



The energy constraint of Lemma [2] also implies that collectively, the mapping from the 
Hi to the Xi, i = 1, . . . , n, is passive (see Fig.QJ)). To show this, consider the following 
power constraint, which may easily be derived from (fTat and dTbt 



±(ul(k)uu(k) - vl(k)vu(k)) = yj(k)xi(k) . 
Substitute (7]) into the global energy constraint of Lemma[2]to obtain 



(7) 



2=1 



{Vi,Xi)N > . 



(8) 



Define x(k) and y (k) as the nm x 1 column vectors formed by concatenating the Xi(k) 
and yi(k), respectively, of each node. Then, it follows that (y, x)n > 0, which satisfies 
the definition of passivity in Def.O with (3 = 0. 

We conclude the section by proving that the entire networked system (e.g., the three 
node system in Fig. [4]) is strictly output passive for arbitrary network topologies. 

Theorem 1. Consider a network of ' n interacting dynamic systems constrained to the 
design framework described in Sect. \3\ Define r(k) and y(k) as the nm x 1 column 
vectors formed by concatenating the ri (k) and yi(k), respectively, of each node. Finally, 
define the input-output mapping H : 1%™ — > V^™ such that H(r(k)) = y(k). Then, H 
is strictly output passive. 

Proof Since each Hi is strictly output passive, there exists ei > and (3i, for all i E V, 
such that 

{Vi,ei) N > ei\\(yi) N \\l -pi. (9) 



Making the substitution, Xi(k) 
inner-product, gives 



ri(k) — ei{k) into ® and using the linearity of the 

n n 

^2(Vh r i)N > ^2(Vi, ei) N . (10) 
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Substituting © into (HOj) yields Y!i=i{Vii r i)N > ^YJi=i IKz/tKlli _ P> where e = 
min^{ei} and (3 = J27=i Pi- This can be rewritten as (y, t)n > 6 ll(2/)iv||2 ~~ 0- a 

4.2 Stability 

The previous result shows that the networked system defined by the mapping H is 
strictly output passive. It then follows that H is /^-stable. 

Theorem 2. The mapping H(r(k)) = y(k) defined in Theorem\]]is Instable. 

Proof. We begin with the notion of finite /^-gain. The map G has finite /^-gain if there 
exists finite constants 7,/? such that for all N e IN, \\(G(u)) N \\ 2 < j\\(u) N \\ 2 + /?, 
\/uelf e . 

It is well known in continuous-time 1 18] and has been shown for discrete-time (5) 
that a sufficient condition for a system to have finite /^-gain is for the system to be 
strictly output passive. Therefore, by TheoremQ] H has finite /^-gain. 

Now suppose u £ I™ (i.e., \\u\\2 < 00). Then take N — > 00 in the definition of finite 
/^-gain . This leads to ||G(u)|| 2 < 7H2 + P < 00, \/u e If. Therefore, H(u) e If. 
By Definition[T] H is /^-stable. " D 

From the proof of Theorem O we see that any system that is strictly output passive 
is necessarily /^-stable. Therefore, each agent described by Hi is inherently stable. 
The benefit of the passivity-based network framework is that it ensures that interactions 
caused by the network do not destabilize the networked multi-agent system. This result 
holds even in the presence of time-varying delays and data loss (under the assumptions 
outlined in Sect. l3.2b because the passivity results hold. Moreover, the networked multi- 
agent system will remain stable regardless of network topology. 



4.3 Steady-State Analysis, n-gon Establishment, and Output Synchronization 

To analyze the behavior of the coupled multi-agent system, we consider the system at 
steady-state. Of course, the system will not reach steady-state in the presence of time- 
varying delays and packet loss. However, the system will approach the ideal steady- 
state case for constant reference inputs even with time-varying delays and data loss 
because by Theorem[T] the system is strictly output passive. For simplicity, we assume 
the system is SISO. If the degrees of freedom of the system are decoupled, this result 
may be applied to MIMO systems. 

Theorem 3. Consider a network of n SISO agents designed using the framework de- 
scribed in Sect.\3\and assume no time delays or data loss. Assume that constant inputs, 
T{, result in steady -state gains gi, for i = 1, 2, . . . , n. Then the steady -state output of 
node i is given by 



^^l^ + ^E^O^+r,) . (id 
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Proof. Since time delays and data loss are ignored, we drop the time index. Using the 
relation ei = r\ — x\ and replacing Hi with g^ the input-output relation yi = Hi{ei) 
may be written as yi = giivi — Xi). First, substitute this into dTbt and solve for xi. Then 
substitute this value of Xi back into yi = Qiivi —Xi)\o get 

Vi = T^n + ^v ii . (12) 

Combining Vji = Uji with ([3ab at node j (roles of j and i are reversed), produces 
Vji = -f=Ujj. Substituting this into (f3bb for node i yields 

^ = ^=E ~h u n - ( 13 ) 

Now, solving yj = gj (rj — Xj) for Xj and substituting into dTab at node j produces 

^^(^i+'i)- (14) 

Substitute d into fl3 to get v u = -^ Y, jeNi ~t== (^^Vj + ^)- Fin ally, 
substitute this into (fT2l) to obtain (TTTb . D 

Theorem [3] provides a system of n equations describing the system asymptotically (as 
k — > oo). The system of equations described by (TTTb are clearly coupled and depend on 
the overlay network structure. For the case of a regular network, the following corollary 
characterizes the system of equations and provides the means to precalculate the refer- 
ence inputs to asymptotically achieve a desired setpoint. For the two-dimensional agent 
model described in Sect. 13. li the two degrees of freedom are decoupled, so we use this 
corollary to establish an n-gon around the target, as described in Sect. [2] 

Corollary 1. Consider a network of n SISO agents with a regular overlay network 
topology (i.e., ni — rij = r] Vz, j G V). If all of the systems Hi have identical steady - 
state gain g and each scattering transformation has the same impedance b, the system 
of steady-state equations may be written as 



y 



^( r + H^H) ' (15) 



where y and r are defined in Theorem\J\and A is the adjacency matrix of the regular 
overlay network topology. Assuming the inverse of (r\I + A ) exists, we may solve this 
equation for r to obtain 

r = ifal + A)- 1 ((bg + 1)7?/ - (bg - 1)A) y . (16) 

Next, we show that the output synchronization problem is achieved under the conditions 
in which the steady- state values are achieved in regular networks. In the presence of 
time-varying delays and packet loss the outputs will approach synchronization, but will 
not synchronize unless packet loss ceases and the delays become fixed and remain fixed. 
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Theorem 4. Assume the hypotheses ofTheorem\3\ but with r\ = p, g\ = g, <2ftd 6^ = b. 
Provided bg > a/id f/ie network has a regular network topology with ni = r] > 0, the 
output synchronization problem is achieved. 

Proof. For the case of common reference input, p, the system of equations given by 
03]) may be solved for y as follows 

y = g (ibg + 1)/ - b -^A) ~* (/ + ±A)pl 4 gB^B 2 pl 4 Spl , (17) 

where 1 is the column vector of ones, £>i = (6g + 1)7 2—^- A, B2 — I + -A, and 

£> = gB^ 1 I>2. Therefore, we have to show that the matrix 5 in (fTD) exists and has 
a right eigenvector of 1 corresponding to eigenvalue g. Then the asymptotic outputs 
y will have an aligned equilibrium point of gpl, and thus output synchronization is 
achieved. 

For existence, we must show that the matrix inverse, B± , exists. Using Gersgorin's 
Disc Theorem, we show that all eigenvalues of B\ are positive, so that B\ is positive 
definite and therefore invertible 1 6 1 . Since the overlay network graph is 77-regular, the 
sum of the absolute values of terms in each row i excluding the diagonal term (B\ ..) is 
\bg — 1|. The diagonal term of each row is bg + 1. Therefore, the Gersgorin discs are all 
equal, and centered at bg + 1 with radius \bg — l\. Since bg > 0, the entirety of all of the 
discs are positive, so by Gersgorin's Disc Theorem, all eigenvalues of B\ are positive. 

Next, we show that B has right eigenvector 1 corresponding to eigenvalue g (for 
brevity we drop the reference to 1). First, since each node has degree rj, A has eigenvalue 
77. Hence, \b 1 = A# 2 = 2. Therefore, since X B -i = ^— , B has eigenvalue g. □ 



5 Simulations 

The experimental setup involves a network of eight velocity-limited quadrotor UAVs 
that communicate in a regular overlay network topology, each with degree 77 = 4, and 
a synchronous sampling period of T = 0.01 seconds. The UAVs move in the plane, 
and the goal is to form an octagon with eah UAV 100m from a target centered at the 
origin. The initial points of the UAVs are randomly selected within the region between 
a 1000m square and the circle with 100m radius, both centered at the origin. We model 
the UAVs as described in Section 13.11 but with saturation in the actuators to limit the 
velocity. The parameters in Corollary Q] used to precalculate the reference values are: 
g = 1 and 6 = 1. The dynamics of the agents are implemented in Simulink, while 
TrueTime is used to simulate the network dynamics and communication between UAVs. 
The network protocol used is IEEE 802.1 lb, with a speed of 1 1 Mbps. 

5.1 Evaluation 

In 021, we evalutated the performance of the networked UAVs in establishing an oc- 
tagon of radius 100m, under various network conditions. In particular, we consid- 
ered: no delays nor data loss (nominal case), nonuniform constant delays, time- varying 
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delays, data loss, and combined effects (time- varying delays and data loss). In this pa- 
per, we demonstrate the performance in establishing the octagon in the case where there 
are time- varying delays and varying amounts of data loss. 

Scenario 1: No data loss. In [13], we only studied the effects of time-varying delays 
with no additional delay bias, which implies that there could be time steps with zero 
delay. Here, we consider the more realistic scenario, where the time- varying delays have 
additional nonuniform constant delay bias in all the communication channels of the 
network. The delay biases are between 1 and 2 seconds. As was done in [ 13 ], to simulate 
the time- varying delays, we introduce a disturbance node in the network. The sampling 
period of the disturbance node is 0.05 seconds, and the disturbance node floods the 
network with disturbance packets based on a Bernoulli process with parameter d (in 
this case, d = 0.5). The disturbance node samples a uniformly distributed random 
variable X[k] G [0, 1] every 0.05 seconds. If X[k] > d, a disturbance packet is forced 
on the network. Figure |5(a)| shows the average and maximum errors for the nominal 
case and the combined delays with no data loss. The results show that in the presence 
of combined delays, the UAVs remain stable and settle to the desired configuration. 

Scenario 2: 5% data loss. In this scenario, 5% packet loss is introduced with the com- 
bined delays described above. A probabilistic model is used to implement the loss of 
packets in the channels. The results for this case are shown in Fig. |5(b)| The results 
illustrate that with packet loss the system does not reach steady-state, and therefore, 
does not achieve zero error. However, the UAVs still manage to come very close to the 
desired configuration, demonstrating the resilience of the network. The UAVs end up 
within a maximum error of 14 meters and an average error of 4 meters of the desired 
configuration. 

Scenario 3: 10% data loss. In this scenario, the probability of packet loss is increased 
to 10% while still imposing the same combined delays as in Scenarios 1 and 2. The 
UAVs will never reach steady-state; however, the system is still stable (in an input- 
output sense). The results for this case are shown in Fig. |5(c)| and Fig. |5(d)| The maxi- 
mum error in this case is considerably larger than in Scenario 2. This is caused by the 
UAV located above and to the left of the target, shown in Fig. |5(d)| 

Table Q] shows a summary of the results, where average and maximum errors are 
averaged over time. The averaging is done over time samples after the errors have settled 
within 2 meters of the final value of the errors. For each of the three cases, the same 
nonuniform constant delays are used. From the table it can be seen that increasing 
packet loss significantly affects performance. 

Table 1. Experimental Results Summary. 



Scenarios 0% 5% 10% 

Average Error over time (m) 0.0 3.9 8.5 

Maximum Error over time (m) 0.0 11.5 28.0 
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Fig. 5. Results with combined delays and varying amounts of data loss. 

6 Conclusions 

Discrete-time implementation of networked multi-agent systems introduces significant 
challenges caused by effects such as network delays and data loss. This paper proposes a 
passive-by-construction distributed network control design framework that ensures l™- 
stability in the presence of these network effects. Using steady-state analysis, we show 
how to control the agents in the multi-agent network in order to establish an n-gon upon 
deployment, and we show that output synchronization can be achieved. Simulations of 
velocity-limited quadrotor unmanned air vehicles (UAVs) are presented to show the 
performance in the presence of network delays and varying amounts of data loss. 
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Abstract. Outdoor environments bear the problem of different terrains along 
with changing driving properties. Therefore, compared to indoor environments, 
the kinematics of mobile robots is much more complex. In this paper we present 
a comprehensive approach to learn the function of outdoor kinematics for mo- 
bile robots. Future robot positions are estimated by employing Gaussian process 
regression (GPR) in combination with an Unscented Kalman filter (UKF). Our 
approach uses optimized terrain models according to the classification of the cur- 
rent terrain - accomplished through Gaussian process classification (GPC) and a 
second order Bayesian filter (BF). Experiments showed our approach to provide 
more accurate estimates compared to single terrain model methods, as well as to 
be competitive to other dynamic approaches. 

Keywords: Mobile robots, Position estimation, Terrain classification, Gaussian 
process, Machine learning. 



1 Introduction 

The kinematics of mobile robots in outdoor scenarios is much more complex than in 
indoor environments due to the varying terrain conditions. Therefore, truly reliable ve- 
locity controls for robots which are able to drive up to 4m/s or even faster (e.g. Figure 
[B are hard to design. The drivability is primarily determined by the terrain conditions. 
Thus, we developed a machine learning system which predicts the future positions of 
mobile robots using optimized terrain models. 

In this paper we present a Gaussian processes (GPs) based method for estimating the 
positions of a mobile robot. Our approach considers the different terrain conditions to 
improve prediction quality. Gaussian process regression (GPR) models are utilized to 
estimate the translational and rotational velocities of the robot. These estimated veloci- 
ties are transferred into the position space using an unscented Kalman filter (UKF). By 
projecting the uncertainty values of the GPR estimates onto the positions, the UKF en- 
ables us to also capitalize the GPR uncertainties. To distinguish the different terrains the 
robot is traversing, we classify the spectrums of vertical accelerations using Gaussian 
process classification (GPC). The transitions between terrains are modeled by a 2nd 
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Fig. 1. The experimental Longcross platform Suworow. 

order Bayesian filter (BF). This allows us to assign different probabilities to distinct 
terrain sequences, as we incorporate the properties of the classifier. 

The remainder of this paper is organized as follows: Related work is presented in 
Section[2j In Section[5]we explain GPs. In Section[4]we describe our dynamic approach. 
Experiments are shown in Section We conclude in Section 

2 Related Work 

Several works study Gaussian processes (GPs) and their application to machine learning 
problems. A detailed view is provided by Rasmussen and Williams lfT5l . Other works 
are Williams ED, and MacKay 021. 

There is a lot of research on prediction of positions or trajectories of mobile robots. 
Many different systems are proposed, like a stereo- vision approach [ 1 1, probability net- 
works [ 3 j or an approach using a particle filter combined with a Monte-Carlo method 
ifTTl . In (16] artificial neural networks (ANN) are employed to build a model predictive 
controller (MPC). The current prediction error is considered to improve the quality of 
the velocity estimations. Another least- square support vector machine (LS-SVM) based 
controller adjusts it's data model iteratively by removing the least important data point 
from the model when adding a new point 1 12]. 

Similar to the work presented here, GPs are used by Girard et al. [ 7 ] to make predic- 
tions several time steps ahead. The uncertainty of the previous step is integrated in the 
regression to track the increasing uncertainty of the estimation. Hence, the uncertainty 
value of the GP is the accumulated uncertainty of the previous time series. In contrast, 
we are able to relate the Gaussian uncertainties by using an UKF. A similar approach 
has been suggested by 1 11], and 1 10]. They estimated the trajectory of an autonomous 
blimp by combining GPR with an UKF or ordinary differential equations (ODE), re- 
spectively. Localization of wireless devices, like mobile telecommunication devices or 
mobile robots is solved by modeling the probability distribution of the signal strength 
with GPs integrated in aBF |6l. 

Several works employ information about the current terrain conditions to improve 
the navigation systems of mobile robots. One intuitive approach is to distinguish 
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between traversable and non-traversable terrain 0, lfT4l . In contrast to the binary sepa- 
ration, Weiss et al. 1 20 ] are using a SVM to classify the vertical accelerations and hence 
the terrain type. Using spectral density analysis (SDA) and principal component anal- 
ysis (PCA) Brooks et al. |2] concentrate on the vibrations a mobile robot experiences 
while driving. The preprocessed data records are categorized through a voting scheme 
of binary classifiers. Another way to analyse the driving conditions is to measure the 
slippage 0, m. 

Kapoor et al. are using GPs with pyramid-match kernels to classify objects from 
visual data. Urtasun and Darrell [18] chose a GP latent variable model to achieve a 
better classification by reducing the input dimension. 

Although many works concentrate on position estimation as well as terrain classi- 
fication, none combined GPR and GPC to solve both problems at once for a velocity 
controller of high speed outdoor robots. 

3 Gaussian Processes 

GPs are applicable to regression as well as classification tasks. In contrast to other 
methods GPs do not have any parameters that have to be determined manually. How- 
ever, the kernel function affects the properties of a GP essentially and must be chosen 
by hand. The parameters of of GPs can be automatically optimized using the training 
data. Furthermore, GPs provide uncertainty values additionally to the estimates. These 
properties makes GPs attractive for regression and classification task like position es- 
timation and terrain classification. However, a drawback of GPs is their running time 
which is quadratic in the number of training cases due to the inversion of the kernel 
matrix. 

3.1 Regression 

Computing the predictive distribution of GPs consists of three major steps. First, we 
determine the Gaussian distribution of the training data and the test data. To integrate 
the information of the training data into the later distribution, we compute the joint 
distribution. Finally, this joint distribution is transformed to the predictive equations of 
GPs by conditioning it completely on the training data. 

Let X = [xi, . . . , x n ] T be the matrix of the n training cases x\. The measured 
process outputs are collected in y = [yi, . . . , y n ] T . The noise of the random process is 
modeled with zero mean and variance a\. The kernel function is used to computed the 
similarities between two cases. Our choice is the squared exponential kernel given by 

k(xi,Xj) =exp(-^£ i -Xj\ 2 ), (1) 

where k is the kernel and x^ Xj are two inputs. A further quantity we need to provide 
the prior distribution of the training data is the kernel matrix of the training data K = 
K(X, X). It is given by K^ = k(xi,Xj) using the kernel function. Now, we are able 
to specify the distribution of y\ 

y~M(Q,K + <%I). (2) 
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We are trying to learn the underlying function of the process. To get the distribution of 
the values of the underlying function / = [/i, . . . , f n ] T we simply need to neglect the 
noise term from the distribution of the training data. 

f~AT(0,K). (3) 

Secondly, we determine the distribution of the test data. Let the set of n* test cases be as- 
sembled in a matrix as the training data, X* = [x* ± , . . . , x* n ^ ] T . /* = [/* 1 , . . . , f* n ^ ] T 
are the function values of the test cases. The normal distribution of the test data is there- 
fore given by 

/*~A/"(0,if**), (4) 

where K** = K(X*,X*) is the kernel matrix of the test cases, representing the simi- 
larities of this data points. 

To combine training and test data distributions in a joint Gaussian distribution we fur- 
ther require the kernel matrix of both sets, denoted by K* — K(X, X*). Consequently, 
the joint distribution of the training and test data is: 



y 



•M o, 



K + all K* 
Kj K** 



(5) 



This combination allows us to incorporate the knowledge contained in the training data 
into the distribution of the function values of the test cases /* , the values of interest. 

Since the process outputs y are known, we can compute the distribution of /* by 
conditioning it on y, resulting in the defining predictive distribution of GP models. 

f*\X,y,X*~Ar(E[f*], V[/*]) (6) 

with the mean and the variance given as 

E[f # ] = Kj[K + o*I\- 1 y (7) 

V[/*] = K„ - Kj[K + crll]- 1 ^. (8) 

The final distribution is defined only in terms of the three different kernel matrices and 
the training targets y. 

3.2 Classification 

The basic principle of GPC for multi-class problems is similar to GPR. Yet GPC suffers 
from the problem that the class labels p(y\f) are not Gaussian distributed. Hence, the 
distribution of the function values given the training data 

MX , y)= mmm (9) 

is also not Gaussian. X and / still denote the training data and its function values, 
whereas y now represents the class labels of the training data. Yet to use GPs we have 
to have Gaussian distributed variables and must approximate p(f\X, y), by the Laplace 
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Fig. 2. Graphical model of our dynamic approach showing the processing of the data and the 
dependencies between the terrain classification and position estimation. 

method for instance. Once the approximation is done, the further procedure is similar 
to GPR models, resulting in the predictive distribution of the GPC models f*\X,y,x*. 
Class probabilities are computed by drawing samples from this final distribution and 
applying the softmax function to squeeze the values into [0,1]. 

GPC have the same properties as GPR. The class probabilities can be seen as a 
confidence value and are perfectly suited to be combined with probabilistic filters. 



4 Dynamic Approach 

Simple controller consider only one type of terrain or use one averaged terrain model. 
In contrast, our approach considers the different effects on the kinematics of mobile 
robots caused by varying terrain conditions. Our algorithm splits naturally in two parts, 
the estimation of the robot's position and the classification of the terrain. Figured shows 
a graphical model of our dynamic approach. 



4.1 Position Estimation 

The position estimation consists of two methods. We use GPs to do regression on the 
robot's velocities and an UKF to transfer the results into the position space. 

The values used were provided by an IMU installed on the robot. A series of pre- 
ceding experiments were done to determine the composition of data types which in 
combination with the projection into the position space allowed the best position esti- 
mation. The data compositions compared in these experiments varied in two aspects, 
first, the type of the data, and second, the amount of past information. 
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Fig. 3. Original (red) and predicted (green) trajectory of a 30 second drive. The UKF error ellipses 
are displayed in blue. Starting out with a small uncertainty the error ellipses (blue) increase over 
time, due to the uncertainty of previous positions and the velocity estimates. 



The data combinations considered were: (1) the change in x-directionj and y-direc- 
tion, (2) the change in x-direction, y-direction and in the orientation, (3) the x- velocity 
and y-velocity, (4) the x-velocity, y-velocity and the change in the orientation, (5) the 
translational and rotational velocity, and (6) the translational and rotational velocity and 
the orientation change. The number of past values was altered from 1 to 4 values to de- 
termine the minimal amount of previous information necessary for the system to be 
robust, while at the same time does not cloud the data records and does not complicate 
the learning task. Leading to a total number of 24 different data compositions. While 
differing in the motion model and complexity, all approaches can be easily transformed 
into positions. The approaches (5) and (6) are utilizing lines and circular paths to ap- 
proximate the robot's trajectory, whereas (l)-(4) are just using lines. Additionally, the 
approaches (2), (4), and (6) are employing an independent estimation of the orientation, 
which may cover possible rotational slip. However, some approaches suffer from a poor 
transformation function, making them rather useless. 

The results showed that a single past value of the translational and rotational velocity 
works best for position estimation. Thus, the data records of the GPR reads as follows: 



[k-i, k, n-i, n, v t -i, ut-i] 



(10) 



Since our Longcross robot has a differential drive, U and r\ denote the values of the 
speed commands to the robot for the left and right side of it's drive. V{ and coi repre- 
sent the translational and rotational velocity, respectively. To learn a controller we need 
both the commands given to the robot and the implementation of these commands. The 
values of interest are the current velocities, v t and uj t . 

We use GPR to solve this regression task which not only provides the estimates of 
translational and rotational velocities but also gives us uncertainty values. We trans- 
form the estimates into positions using an UKF. Additionally, the uncertainties are pro- 
jected onto these positions and are propagated over time (FigureO. Starting with a quite 
certain position the sizes of the error ellipses are increasing, due to the uncertainties of 



1 All directions are relative to the robot. The x-directions points always in the front direction of 
the robot, and the y-direction orthogonal laterally. 
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the velocity estimates and the increasing uncertainties of the previous positions they 
rely on. 

Given only the first translational and rotational velocity, the regression does not rely 
on any further IMU values. It takes previous estimates as inputs for the next predictions. 
Assuming the first velocities to be zero (i.e. the robot is standing) makes our approach 
completely independent of the IMU device. However, IMU data is required for training. 

For each type of terrain we trained twqj GPR models with data recorded only on 
that terrain. This has several advantages: Firstly, the effects of the robot commands on 
a specific terrain can be learned more accurately, since they are not contaminated by 
effects on other terrain types. Secondly, in combination more effects can be learned. 
And thirdly, due to the allocation of the training data to several GPR models the size 
and complexity of each model is smaller compared to what a single model covering all 
terrain types would require. The classification described in the next paragraph enables 
us to select the appropriate terrain models for regression. 

4.2 Terrain Classification 

The terrain classification is implemented with a GPC model, and a 2nd order BF is used 
to model the transitions between terrains and to smooth the classification results. 

As for the position estimation task preceding experiments were conducted to deter- 
mine the best suited data records for the classification task. In contrast, here are mainly 
two possible data types, the vertical velocities and accelerations. 

By taking a look at the raw IMU data it is evident that the different terrains are 
hardly distingushable and that it may need to be preprocessed. To be sure, we classified 
vertical velocity and acceleration records of varying sizes. Since the single values carry 
no information about the terrain, we took vectors of 25, 30, 35, and 40 values. The 
results showed the raw data to be improper for the classification task at hand. 

The information about the terrain types is enclosed in the vibrations a mobile robot 
experiences while driving. We used the fast Fourier transform (FFT) to extract the fre- 
quency information. Given that the FFT works best with window sizes being a power 
of 2, we chose 16, 32, 64, and 128 as input lengths. Due to the symmetric structure 
of the FFT output, it is sufficient to use only the first half of the result. This reduces 
the complexity of the problem and allows us to consider larger window sizes as for the 
raw data. Using Fourier transformed vertical velocities and accelerations improved the 
classification results, as was expected. We found the vector of 128 vertical acceleration 
values to work best for our task. Hence, the input to the GPC model is given by 

[fftazo, . . . Jftazes], (11) 

where fftazi denotes the ith value of the FFT result. However, we did not consider 
input lengths of 256 or longer, since it would take more than 2 seconds to collect the 
data for a single classification. 

If the robot's speed is too slow the terrain characteristics are not longer present in 
the vibrations, and therefore a good classification is impossible. Thus, we omit data for 
which the robot's translational velocity is below some threshold r. 

2 GPs are not able to handle multi-dimensional outputs. Hence, we need one GPR model for 
each velocity, translational and rotational. 
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Fig. 4. Estimation of the translational velocities (upper row), and of the rotational velocities 
(lower row) with error bars. Even if the differences between the models are not significant, they 
result in rather distinct trajectories. 



Due to the native charaters of the terrains, the driving conditions vary within a terrain 
type. This makes it hard to achieve a perfect classification. To deal with false predictions 
and to account for the properties of the classifier, we used a 2nd order BF to model the 
transitions between terrains. The quality values GPC models provide, come in terms of 
class probabilities which are well suited to be included into a probabilistic filter like the 
BF. The defining equation of our 2nd order BF is slightly modified to accommodate the 
classification and the dependency on two previous beliefs. 



6(:C t | t _l)0C ^ X^ P( X t\t-l\Xt-l\t-l,X t -2\t-2) 

%t-l\t-l x t-2\t-2 

K X t-l\t-l) -K X t-2\t-2) 



(12) 



The notation x t \ t _i denotes the value of the state x at time t given all observations up to 
time t — 1. &(•) is the belief, andp(2^|t-i|#t-i|t-i> #t-2|t-2) represents the transition 
probabilities, i.e. the dependency on the previous states. At this point the belief of the 
current state x t depends only on the observations up to z t -\. We include the current 
observation z t via the result of the classifier c t . 



H x t\t) 



P(xt\t-i\ct,z t )p(ct\zt) 

P(%t\t-i\zt) 



b(x t \ t _ 1 ), 



(13) 



where p(c t \z t ) is the classification result. The output of our BF is the belief of state 
x t given all observations including z t . It considers the classification and two previous 
states. 

As already mentioned, the terrain classification is used to determine on which terrain 
the robot is currently driving and hence which terrain model should be used to estimate 
the next position. Since the robot commands required for the position estimation are 
given at a rate of approximately 3 Hz, whereas the terrain classification relies only on 
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Algorithm 1. Dynamic Approach. 

Input: initialized and trained GPR and GPC models 

while sensor data q* available do 
2: get p* and q* from sensors 

if p* available and speed > r then 
4: s* = FFT(p*) 

t p = GPC.classify(s*) 
6: b p = BF.process(tp) 

model = selectModel(frp) 
8: end if 

[v,uj] = GPR.predict(raode/,g*) 
10: pos = UKF.process(v, u) 

pub\ish(pos) 
12: end while 

The algorithm requires trained GP models. If sensor data q* is available, the algorithm predicts 
the velocity values using the current terrain model. If sufficient vibration data p* is available for 
classification (omitting low speed data), a new classification of the terrain will be performed and 
the current model will be updated. 



IMU data provided at 100 Hz, we implemented both parts to work independently of 
each other. However, it takes about a second to acquire enough data for classification, 
thus we reuse the last terrain model until a new classification is available. Below we 
will refer to the terrain models used by the GPR as applied models to distinguish them 
from the fewer classifications which were actually performed. Algorithm!]] summarizes 
our approach. 

5 Experimental Results 

The Longcross (Figure G} is an experimental platform weighing about 340 kg with a 
payload capacity of at least 150 kg. The compartment consists of carbon-fibre and is 
environmentally shielded. Our version is equipped with a SICK LMS 200 2D laser 
scanner, a Velodyne Lidar HDL-64E S2 3D laser scanner, and an Oxford Technical 
Solutions Ltd RT3000 combined GPS receiver and inertial unit. The software runs on a 
dedicated notebook with a Intel Core 2 Extreme Quad processor and 8 GB memory. 

5.1 Naive Approaches 

We evaluated our approach on a test drive covering three different terrains, starting 
on field, continuing with grass, and ending on asphalt. The test sequence consists of 
190 positions (about a 1 minute) almost evenly partitioned on the three terrains. We 
compared our dynamic approach to afield model, an asphalt model, a grass model and 
an averaged model. The first three models were trained only on the specific terrain. The 
averaged model was trained on all three terrains equally. All models used a total of 1500 
training cases. The dynamic approach utilizes the three terrain specific models. 
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We used the mean squared error (MSE) and the standardized MSE to evaluate the 
quality of the regression. The SMSE is the MSE divided by the variance of the test 
points, thus it does not depend on the overall scaling of the values. The MSE is not 
applicable to measure the quality of a trajectory because errors at the beginning are 
propagated through the entire trajectory influencing all proceeding positions. So we 
introduced the mean segment distance error (MSDE). First, we split the original trajec- 
tory and the prediction in segments of fixed size. Each pair of segments is normalized 
to the same starting position and orientation. The distance of the resulting end positions 
is computed subsequently. The mean of all these distances constitutes the MSDE. We 
found a segment size of 10 to be a reasonable size. 

The estimation results of the translational (upper row) and the rotational velocities 
(lower row) of the field model, the averaged model and of our dynamic approach are 
shown in Figured The field model is the best of the simple models. The averaged model 
would be the model of choice if one wants to consider different terrain conditions but 
does not want to employ a dynamic approach. 

The estimation of the translational velocities by the averaged model shows little 
or no reaction to any of the outliers. Averaging the effects of commands over several 
terrains involves diminishing effects of certain command-terrain combinations, hence 
causing inaccurate estimates. The field model performs best of the simple approaches, 
mainly because it recognizes the slowdown around the 70th test case. However, the 
other two outliers are not identified. The estimates of the remaining test cases are similar 
to the averaged model, yet the field model is slightly superior. Our dynamic approach 
outperforms both previous models. In contrast, it recognizes all outliers, as the right 
terrain model is applied most of the times. Additionally, the estimates of each test point 
are pretty accurate. 

The estimates of the rotational velocities give a similar picture. The averaged model 
is somewhat off at the beginning, underestimating the true values. The field model is 
much more accurate. The error bars seem to be larger than in the upper row but it is 
due to the scale of the data. However, our approach again provides the best estimation, 
reacting even to minor changes in the velocity values. 

The velocities are translated into positions using an UKF. The resulting trajectories 
of all models tested are shown in Figure [3 The previous tendencies are reflected in the 
quality of the trajectories. The averaged model's trajectory is rather inaccurate, followed 
by the trajectory of the field model which is outperformed by our dynamic approach. 
At the beginning the predicted tracks are close, the differences start during the first turn 
and increase by the time. Even though the distinction between the velocity estimates are 
relatively small, the impact on the trajectories are formidable. Especially the rotation 
values are crucial to the trajectory quality. 

We analyzed the quality of the GPR estimations of the velocities with the MSE and 
the SMSE. Due to the reasons mentioned above, the trajectories are evaluated by the 
MSDE. The results are presented in Table[TJ 

We also examined the classification performance. Figure[6]displays the results of the 
classification after applying the BF, and the terrain models used for the regression. The 
figures show that the classifier tends to categorize field as grass vibrations and grass 
records as asphalt. One problem is that the field and grass terrains are alike. At the low 
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Table 1. Prediction quality. The translational velocity unit is m/s, the rotational velocity unit is 
rad/s, and the MSDE unit is m. 



Model 




MSE 


SMSE 


MSDEio 


Field 


V 


0.02522 


0.10588 


0.42923 


uj 


0.00331 


0.23605 


Asphalt 


V 


0.07032 


0.29517 


0.56363 


OJ 


0.00533 


0.38034 


Grass 


V 


0.05161 


0.21662 


0.51907 


UJ 


0.00365 


0.26041 


Averaged 


V 


0.06203 


0.26038 


0.45478 


CJ 


0.00377 


0.26929 


Dynamic 


V 


0.01181 


0.04956 


0.31407 


UJ 


0.00279 


0.19917 
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Fig. 5. Predicted trajectories. Starting the models' predictions are exact but differ increasingly 
along the time line, leading to unlike trajectories. 
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Fig. 6. Classification results (upper image) and applied models (lower image). The classification 
matches the sequence of field (red triangle), grass (green circles), and asphalt (grey squares) with 
high accuracy. Transitions are not classifiable because the records contain data of two terrains. 
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Table 2. Distribution of the classification results of the GP- Approach. 



Classification 



Field 



£ Asphalt 

3- 



O Grass 



Original Terrain 



Field 



100.00% 



0.00% 



0.00% 



Asphalt 



0.00% 



93.75% 



6.25% 



Grass 



5.56% 



0.00% 



94.44% 



Table 3. Distribution of the applied models of the GP- Approach. 



Applied 
Models 


Original Terrain 
Field Asphalt Grass 


Field 


100.00% 


0.00% 


5.71% 


§ Asphalt 


0.00% 


95,52% 


0.00% 


O Grass 


0.00% 


4.48% 


94,29% 



speed of approximately 1 m/s the vibrations experienced on grass are close to the ones 
recorded on asphalt. Nonetheless, almost all false classifications are compensated by 
the BR 

Table [2] and [3] list the classification quality in terms of correct classifications and ap- 
plied models, respectively. The GP- Approach determined the correct class with 96.06% 
accuracy, and hence, applied the right model in 96.84% of all cases. The results are bro- 
ken down into the single terrain classes, giving more insight into the classifier's bias. 



5.2 Dynamic Approaches 

In the previous section we compared our dynamic approach to single terrain models, 
which by definition cannot be optimal when the robot drives on several different ter- 
rains. Therefore, we constructed another dynamic approach similar to the one presented. 
By doing so, we combined a support vector machine (SVM) to perform the regression 
of the velocities, and a k-nearest neighbor (k-NN) method for classification. We chose 
to combine a SVM and k-NN because of their low runtimes. The conducted experiments 
showed the overall runtime of the GP approach to be a serious issue. 

We omitted the UKF from the system because the SVM does not provide any uncer- 
tainty values. These values are essential to the UKF in order to work properly. However, 
we kept the 2nd order BF, even though the k-NN method usually returns simple class 
labels rather than class probabilities. We counted the votes for each class and divided 
them by the total number of neighbors k to provide class probabilities anyway. We used 
the LibSVM library [4 ] with a radial basis function (RBF) as SVM kernel and found the 
OpenCV k-NN implementation with k = 8 neighbors to work best for vibrations. 

To train the terrain models of the SVM we used the same training data as we did for 
the GP models. The training set of the classifiers are also unchanged. 
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Fig. 7. Comparison of the two dynamic approaches. Estimation of the translational velocities (left 
column), and of the rotational velocities (right column) with error bars. 



Since this approach is able utilize several terrain models for prediction, it should 
be more competitive than the single terrain model approaches. To distinguish the two 
dynamic approaches, we will refer to the GP based system as GP- Approach and to the 
SVM and k-NN system as SVM- Approach. 

The top row of Figure [7] shows the estimations of the translational and rotational 
velocities by the GP- Approach from the previous section. The prediction results of the 
SVM- Approach are illustrated below. Both dynamic approaches fit the original transla- 
tional velocities fairly accurately and recognize all outliers. The estimation differences 
between the two systems are somewhat more apparent when we concentrate on the ro- 
tational velocities. The SVM- Approach underestimates the first part and overestimates 
the velocities towards the end. Its predictions are more agitated in contrast to the aver- 
aging characteristic of the GP estimations. 

The observations are reflected in the error values of Table |4] Even though the errors 
of the dynamic approaches are very small - they are the smallest of all tested models - 
the values of the SVM- Approach are more than | higher. As we will see later, this is 
partly due to the lower k-NN classification quality. 

The SVM's underestimation of the rotational velocity corresponds to the assumption 
of a less sharper turn. The green trajectory in Figure [8] shows exactly this behavior. 
Furthermore, the higher estimated rotational velocities at the end of the trajectory result 
in tighter turns. Regardless of the rather different predictive trajectories, the trajectory 
qualities in Tableware almost the same. 

With respect to the classification task the GP classifier performs much better than 
the k-NN classifier. The sequence of classifications and applied models are compared 
in Figure [9] While the GP- Approach keeps the current terrain until enough evidence 
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Table 4. Prediction quality of the two dynamic approaches. The translational velocity unit is 
m/s, the rotational velocity unit is rad/s, and the MSDE unit is m. 



Model 




MSE 


SMSE 


MSDEio 


GP 


V 


0.01181 


0.04956 


0.31407 


Approach 


OJ 


0.00279 


0.19917 


SVM 
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0.01923 


0.08073 


0.30336 


Approach 


UJ 


0.00464 


0.33089 
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Fig. 8. Predicted trajectories of the two dynamic approaches. 
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Fig. 9. Comparison of the two dynamic approaches. Classification results (upper image) and ap- 
plied models (lower image). Transitions are not classifiable because the records contain data of 
two terrains. 



is present that the terrain may have changed, the SVM- Approach tends to change the 
terrain class more quickly. This makes the system to apply the wrong terrain model 
more often. 

Table[5]and[6]point out that the k-NN classifier struggles to distinguish grass and field 
records. The overall performance of our GP- Approach is 96.06% correct classification 
in contrast to the SVM-Approach's 70.00%. This is a gap in performance of more than 
25%. Evaluating the quality of the applied models, we can see a slight improvement 
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Table 5. Distribution of the classification results of the dynamic approaches. 



Classification 


Original Terrain 
Field Asphalt 


Grass 




GP 


SVM 


GP 


SVM 


GP 


SVM 


Field 


100.00% 


71.43% 


0.00% 


0.00% 


5.56% 


33.33% 


$ Asphalt 


0.00% 


0.00% 


93.75% 


93.75% 


0.00% 


11.11% 


O Grass 


0.00% 


28.57% 


6.25% 


6.25% 


94.44% 


55.56% 



Table 6. Distribution of the applied models of the dynamic approaches. 



Applied 
Model 


Original Terrain 
Field Asphalt 


Grass 


GP 


SVM 


GP 


SVM 


GP 


SVM 


Field 


100.00% 


71.70% 


0.00% 


0.00% 


4.29% 


27.14% 


§ Asphalt 


0.00% 


0.00% 


95.52% 


95.52% 


0.00% 


10.00% 


O Grass 


0.00% 


28.30% 


4.48% 


4.48% 


95.71% 


62.86% 



by the SVM- Approach of about 9%, resulting in 78.86% correctly applied models. In 
comparision the GP- Approach applied the right model in 96.84% of the cases, still 
staying way ahead. 

As stated earlier the classification quality is crucial to the performance of the over- 
all system, since false classifications lead to the application of wrong terrain models, 
resulting in worse velocity estimations. However, the SVM- Approach, incorporating 
SVM regression and k-NN classification, is significantly faster than our GP- Approach. 



6 Conclusions and Future Work 



In this paper we introduced a dynamic approach to estimate positions of a mobile robot. 
Since the terrain conditions are crucial to the robot's implementation of velocity com- 
mands, we consider the terrain for the prediction of future positions. We used GPs for 
the regression of translational and rotational velocities. An UKF transfers the velocity 
estimates into positions and propagates the uncertainties of the positions over time. The 
vibration affecting the robot are classified by a GPC model in order to separate different 
terrains. A 2nd order BF is used to compensate for classification errors and to model 
the terrain transitions. 

The prediction problem considered in this paper has several difficult properties: first, 
with respect to the classification task, the visual ground truth is not always accurate. 
Due to natural terrain types, which in themselves can be vastly different, vibrations in- 
duced by separate terrains may sometimes look alike or even show characteristics of 
another terrain. In a consequence, making it tricky to evaluate the classifier's perfor- 
mance. Second, finding a reasonable diversity of terrains is quite challenging, which 
aggravates the task of determining the transition probabilities from real data. 
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Nevertheless, our approach proved to be more effective than simple approaches uti- 
lizing a single model for all terrains. Furthermore, our GP- Approach is competitive 
with other dynamic approaches using different machine learning techniques. The dy- 
namic systems, our GP- Approach and the slightly structurally altered SVM- Approach, 
are both better than all simple models, proving the basic idea of our approach to be well 
suited to solve such prediction problems. 

Future work will be to reduce the run time. Due to the use of GPs the entire systems 
suffers of a long run time. Several sparse algorithms for GPs exist which would help to 
improve this issue. We are working on the implementation of such a sparse method at 
the moment. 

Also GPRs and GPCs were compared to other well known methods for regression 
and classification, respectively. The run time of SVMs is smaller while performing 
equally. However, this would mean losing the uncertainty estimates, i.e. the quality 
indicators. This trade-off might be worth further investigation. Therefore and to deter- 
mine the benefit of such a dynamic method in practice, we are currently integrating the 
SVM-based version in our local navigation software. Once the run time of the GP-based 
version allows the use in online applications, the integration in our local navigation 
framework will follow. 

Moreover, expanding the current system by additional terrains as sand or gravel 
would be desirable. Extending the classification records by the translational velocity 
could improve classification results, since it allows to learn the effects of different 
speeds on the vibrations. 
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Abstract. In the paper we want to present a problem of path following for non- 
holonomic mobile manipulators. In our consideration we restrict ourself to dou- 
bly nonholonomic mobile manipulators. Nonholonomic constraints appear due 
to motion of the robotic system without slipping effect between wheels and the 
surface (for nonholonomic platform) or between special supporting wheels and 
ball in an onboard manipulator equipped with nonholonomic gears. As a point 
of departure for mathematical formulation of this problem we will use so-called 
Frenet parametrization. It establishes some tool to find some control laws solving 
the formulated problem. A role of orientation in the path tracking will depend on 
the user-choice and number of control inputs. 

Keywords: Mobile manipulator, Nonholonomic constraints, Path following, 
Frenet parametrization. 



1 Introduction 

An object of our considerations will be a special robotic system, i.e. a nonholonomic 
mobile manipulator. The mobile manipulator consists of a mobile platform, equipped 
with non-deformable wheels, and an onboard rigid manipulator mounted on the 
platform. 

The problem with a path definition for the end-effector of the mobile manipulator 
is that behavior of the subsystems is unpredictable, because the same path defined in 
global coordinates can be executed by separate subsystems or by both of them. Some- 
times, it is important to move the platform and simultaneously unload a payload: such 
a task is defined relative to the base of the manipulator mounted on the platform; defi- 
nition relative to the end-effector is ill-conditioned. In such situation the decomposition 
of the task into sub-tasks defined separately for both subsystems is more natural and 
convenient 0. 

In the paper we will make an assumption that a desired task can be decomposed 
into two separate sub-tasks defined for each subsystem independently: the end-effector 
has to follow a desired geometric path II (s) described relative to the base of manip- 
ulator (i.e. relative to the platform) and the task of the platform is to follow a desired 
curve P(s) lying on a plane. Such a formulation of the task makes possible successive 
unloading of payload transported by the mobile manipulator during the control process. 
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Fig. 1. Decomposition of a task for doubly nonholonomic mobile manipulator: P(s) - desired 
path for a nonholonomic platform, 77 (s) - desired path for a nonholonomic manipulator. 

If we take into account the type of components mobility of mobile manipulators, we 
obtain 4 possible configurations: type (ft, ft) - both the platform and the manipulator 
holonomic, type (ft, nft) - a holonomic platform with a nonholonomic manipulator, 
type (n ft, ft) - a nonholonomic platform with a holonomic manipulator, and finally type 
(nft, nft) - both the platform and the manipulator nonholonomic. The notion doubly 
nonholonomic manipulator was introduced in |[TQl for the type (nft, nft). 



2 Mathematical Models of Doubly Nonholonomic Mobile 
Manipulator 

In this paper we restrict our considerations to mobile manipulators of (nft, nft) type, 
i.e. to doubly nonholonomic objects. We will discuss the constraints occurring in the 
motion of both subsystems. Nonholonomic constraints appearing in the motion of me- 
chanical systems come from different sources. Very often they come from an assump- 
tion that a motion of the system can be treated as pure rolling of components, without 
slippage effect. We have taken such assumption in description of constrained motion of 
the considered mobile manipulator. 

It is worth to mention that there are other sources of nonholonomic constraints, i.e. 
principle of conservation of momentum, non-integrable relationship between orienta- 
tion of a rigid body expressed in local and global coordinates etc. 

In the next step we want to calculate dynamics of the whole complex system i.e. 
mobile manipulator with nonholonomic restriction in the motion. 



2.1 Nonholonomic Constraints for Wheeled Mobile Platform 

Motion of the mobile platform can be described by generalized coordinates q m G R n 
and generalized velocities q m G R n . The wheeled mobile platform should move with- 
out slippage of its wheels. It is equivalent to an assumption that the momentary velocity 
at the contact point between each wheel and the motion plane is equal to zero. This 
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assumption implies the existence of I (I < n) independent nonholonomic constraints 
expressed in Pfaffian form 

Ai(q m )qm = 0, (1) 

where A\ (q m ) is a full rank matrix of (I x n) size. Since due to (Q} the platform velocity 
is in a null space of A(q m ), it is always possible to find a vector of special auxiliary 
velocities r] G R m , m = n — I, such that 

q-m = Gi(q m )rj, (2) 

where G\ is an n x m full rank matrix satisfying the relationship A\G\ = 0. We will 
call the equation (0 the kinematics of the nonholonomic mobile platform. 

2.2 Nonholonomic Constraints for Manipulator 

A rigid manipulator can be a holonomic or a nonholonomic system - it depends on 
construction of its drives. In 17llt the authors have presented a new nonholonomic me- 
chanical gear, which could transmit velocities from the inputs to many passive joints, 
see Fig. [2] The prototype of 4-link manipulator with such gears has been developed in 
last 1990th years. The nonholonomic constraints in the gear appear by assumption on 
rolling contact without slippage between balls of gear and wheels in the robot joints. 

The basic components of the gear presented in Fig.|2]are a ball and three wheels - an 
input wheel IW and two output wheels OWi and OW2 . The velocity constraints of the 
ball are only due to point contact with the wheels. The input wheel IW is mounted in the 
first joint, the output wheels are mounted in the next joint. The wheel IW rotates around 
the fixed axis oli with an angular velocity 112, which plays the role of a control input. 
The rotating input wheel IW makes the ball rotate. The wheel OWi rotates around an 
axis ao, which makes with the axis of the input wheel a joint angle 8\. The angular 
velocity 0\ = u\ is the second control input for the manipulator with nonholonomic 
gears. 

Nonholonomic constraints (the kinematics) of n-pendulum have the form 

0i=ui, (3) 

2-2 

0i = disiriQi-x Y\ cos 9 jU 2 , i G {2, . . .,n}, (4) 

with positive coefficients ai depending on gear ratios or the same in matrix form 

: =G 2 (q r )u. (5) 

w 

It is worth to emphasize that only two inputs u\ and u 2 are able to control many joints 
of manipulator equipped with gears designed by Nakamura, Chung and S0rdalen. 
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Fig. 2. Nonholonomic gear: a) schematic of the the gear, b) the gear seen from above. 



2.3 Dynamics of Doubly Nonholonomic Mobile Manipulator 

Because of the nonholonomy of constraints, to obtain the dynamic model of mobile 
manipulator the d' Alembert Principle has to be used 

Q(q)q + C(q, q)q + D(q) = A n (q rn )X 1 + A 21 (q r )X 2 + Br, (6) 

where: 

Q(q) - inertia matrix of the mobile manipulator, 

C(q,q) - matrix of Coriolis and centrifugal forces, 

D(q) - vector of gravitational terms 

An - matrix of nonholonomic constraints for zth subsystem, 

Xi - vector of Lagrange multipliers for zth subsystem, 

B - input matrix, 

r - vector of controls, 

Q T — (Qmi Or) ~ generalized coordinates of the mobile manipulator. 

Matrices An(q m ) and A 2 i (q r ) are Pfaff 's matrices for the platform and the manipulator 
respectively, whereas B is so-called input matrix 

X(<?m)l , _ r o 

' Ml ~ AT(q r ) 



An 



B 



B 1 
B 2 



Submatrices Bi and B 2 describe which coordinates of both subsystems are directly 
driven by actuators. Equations of constraints (0 and © can be rewritten as the kine- 
matics in one block as follows 



Gi 
G 2 



G(, 



C = 



(7) 



where £ is the vector of auxiliary velocities for both subsystems. After substituting the 
equation (0 into the dynamics © we get 



Q*C + C*( + D* = B*t 



(8) 



Orientation of Doubly Nonholonomic Mobile Manipulator 175 

with elements defined in the following way 

Q* = G T QG, C* = G T (QG + CG), D*=G T D, B*=G T B. 

The equation ([5]) describes the dynamics of doubly nonholonomic mobile manipulator 
expressed in the auxiliary coordinates. 

3 Control Problem Statement 

As it has been mentioned in the introduction, the desired task for the mobile manipulator 
can be decomposed into two independent parts: the end-effector of robotic arm has to 
follow a geometric path 77 (s) described relative to the platform without stopping at the 
end and the task of the platform is to move continuously along desired path P(s) lying 
on the plane, see Fig.Q] 

A goal of this chapter will be to address the following control problem for mobile 
manipulators: 

Design a control law r such that a mobile manipulator with fully known dy- 
namics follows the desired paths defined separately for each subsystem, and 
tracking errors converge asymptotically to zero. 

Note, that a complete model of the nonholonomic system has a structure of two cas- 
caded equations: kinematics (nonholonomic constraints) and dynamics. For this reason 
the backstepping-like procedure for the designing of control law should be used, : 

- kinematic controller ( r {t) - represents an embedded control input, which ensures 
the realizability of the path following for the nonholonomic constraints. Kinematic 
controller can be treated as a solution to the kinematics ©, if the dynamics were 
not present. Such the controller generates a 'velocity profile' which can be executed 
in practice. The convergence of the kinematic control algorithm must be proven. 

- dynamic controller r - as a consequence of cascaded structure of the model, the 
system's auxiliary velocities ( cannot be commanded directly, as it is assumed in 
the designing of kinematic control, and instead they must be realized as the out- 
put of the dynamics ® driven by r. The dynamic input r intends to regulate the 
real velocities ( toward the reference control ( r and, therefore, attempts to provide 
control input necessary to achieve the desired task. 

4 Description of Nonholonomic System Relative to a Given Path 

For nonholonomic systems whose workspace is planar, it is possible to describe state 
variables relative to global inertial frame as well as to a given path [4], see Fig. [3] 

The path P is characterized by a curvature c(s), which is the inversion of the radius 
of the circle tangent to the path at a point characterized by the parameter s. Consider a 
moving point M and the associated Frenet frame defined on the curve P by the normal 
and tangent unit vectors x n and ^ . The point M is the mass center of a mobile platform 
and M' is the orthogonal projection of the point M on the path P. The point M' exists 
and is uniquely defined if some conditions are satisfied, see 1 2 ] for details. 
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Fig. 3. Illustration of path following problem for the nonholonomic platform. 

The coordinates of the point M relative to the Frenet frame are (0, 1) and relative to 
the basic frame XqYo are equal to (x, y), where I is the distance between M and M\ A 
curvilinear abscissa of M' is equal to s, where s is a distance along the path from some 
arbitrarily chosen point. 

If we want to express the position of the point M not in coordinates (x, y) relative to 
inertial frame, but relative to the given path P, we should use some geometric relation- 
ships, m, 



I = (— sin# r cos6 r ) 



s = 



(cos* 



-) 



1 - c(s)l 



(9) 



where x and y are defined by nonholonomic constraints for the system (wheeled mobile 
platform or nonholonomic manipulator) and 6 r is a desired orientation at the point M' 
on the path. 

4.1 Path Following with Desired Orientation 

Sometimes it is very important, especially for nonholonomic platforms, to keep the 
orientation of the subsystems under control. In such a situation we have only two control 
inputs and more than two coordinates to control. Therefore we make take an assumption 
that we will not to regulate the variable s. Such the assumption is good conditioned 
because for asymptotic path tracking (we have formulated in such a way our control 
problem) 



Nonholonomic Mobile Platform. Posture of the mobile platform is defined not only 
by the position of the mass center, but by the orientation, too. For this reason, it is 
necessary to define the orientation tracking error equal to = 6 — 6 r . Moreover, at the 
point M' the desired orientation of the platform fulfills a condition, (HJ, 



Orientation of Doubly Nonholonomic Mobile Manipulator 177 

r = c(s)s. (10) 

Then the coordinates 

£=(Z,0,s) T (11) 

i.e. the Frenet coordinates (Z, s) and orientation tracking error 6, constitute path fol- 
lowing errors for nonholonomic mobile platform. It is worth to mention that Frenet 
parametrization is valid only locally, near the desired path. 

If we want to solve the path following problem, it is necessary to express the kinemat- 
ics of nonholonomic mobile platform in Frenet coordinates (ITTb instead of generalized 
coordinates q m . After using equations © and dTUb . the nonholonomic constraints can 
be represented by the dynamic driftless system. 

In our solution to the path following problem for nonholonomic wheeled mobile 
platform we have omitted the differential equation for s, as we mentioned earlier. It 
means that it is enough to consider only Z and 6. 

For example, the nonholonomic constraints for the platform of the class (2,0) are 
given by 

f x\ |~cos#0l / x 

V ) =G 1 (q rn )r 1 . (12) 

After substituting (fT2l) into ([5]) and dTUb . we get Frenet variables for the mobile platform 
of (2, 0) class in the form 

I = v8m6, (13) 

< s ) _ ... 



"cos(9 0" 


sin 


1 



vcosQ 



1 - c(s)l 



where w is a new control input for the second equation. 

For the system (fT3l) we can use many kinematic control laws, e.g. Samson algorithm 
introduced in [9|, 



v r \ __ const 

w r ) ~ ( Vr ml-k 3 e, 



k 2 M>^ (14) 



which is asymptotically stable. It can be shown using the following Lyapunov-like 
function 

V(l,~6) = k 2 l - + °- (15) 

and Barbalat lemma. 

Path following with the desired orientation is very important for mobile systems, 
especially for mobile manipulators. It comes e.g. from the fact that it would be impossi- 
ble to unload a payload if the platform had wrong orientation, i.e. it would be in a right 
place but it would be back-oriented. 
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Fig. 4. Schematic of 3-link nonholonomic manipulator. 

Nonholonomic Manipulator. For a nonholonomic manipulator it is possible to follow 
the desired path with prescribed orientation. However, this issue has a drawback: the 
nonholonomic manipulator has only two control inputs, therefore it is impossible to 
have all parameters (l,s,6) under control. In such a case many authors decide to regu- 
late only two tracking errors (/, 6) to zero. Such a case of the path following problem is 
so-called the asymptotic path following. 

The Frenet parametrization can be evoked once again in the problem of path follow- 
ing for the planar manipulator, e.g. 3 -pendulum with nonholonomic gears moving on 
the XZ surface, see Fig. [4) The role of the point M in Fig.[5]plays a point at the end of a 
gripper. The orientation of the end-effector 6 m is an rotation angle of the frame associ- 
ated with the gripper around -Y& axis, which is located in the base of the manipulator. It 
means that the orientation of the end-effector in the planar nonholonomic n-pendulum 
is then equal to 

n 
i=l 

In the considered planar nonholonomic manipulator lying in XZ-plane, relationships 
between velocity of the working point M expressed in Cartesian and curvilinear coor- 
dinates have the form 

l m = (-sme rrn cosfl rm )^.j, s= 1 _ c{8)lm [i), (16) 

where l m denotes distance between the point M and the path 77 (s), and rrn is the ori- 
entation of the Frenet frame in the point M'. Subscripts were introduced to distinguish 
Frenet variables for both subsystems of the (nft, nh) mobile manipulator. 

Coordinates of the end-effector in the n-pendulum relative to its base are equal to 

n i n i 

x = ^2kcos(^20j) z = ^2lism(^20j). (17) 

Substituting time derivatives of variables (fTD) , we obtain the following equations 

n i i 

im = ^cos(# rm ~y^pj) -uy^Ok, (is) 

i=l j=l k=l 
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n / \ n i i 

L = J2^- 1 _ c ( s )i ■ I> in ^™ " J2 e ^ li J2^- (19) 

i=l ^ ' m i=l j=l k=l 

Using the kinematics of the nonholonomic manipulator given by ©-(111, the equations 
(fT8l) and dT9t can be expressed in the matrix form as follows 



~ m =H(q r ,€ m ) 



/< 



V 



'1 

H(q r ^m)G 2 (qr)u = Ki(q r ,£ m )u. (20) 



Matrix Ki(q ri £ m ) fulfills the regularity condition (i.e. it is invertible) if some configu- 
rations, which imply the matrix singularity, are excluded from a set of possibly achieved 
poses of the nonholonomic manipulator. 

For nonholonomic 3 -pendulum matrix Ki(q r ,£ m ) has the form 



Ki(q r ,im) 



Km Kn2 

Ki2l K122 



with elements defined below 

3 i 

Km = ^2h cos(<9 rm - ^ 0*), 
i=i j=i 

3 i 3 

Kn2 = a 2 Si ^ l i COS (#™ ~ ^ Oi) + CI3S2C1I3 COs(# rm - ^ ^), 
i=2 j=l j=l 

( \ 3 i 

Km = 1 - 1 _ t\, J2 h sin ^™ - J2 *<)> 

2=1 j = l 

^22 = a 2 5i [1 - - _ C( ^\ 7 ^ Z< sin(0 rm - ^ 00] 
l c{s)l m , =2 j=i 

c(s) 3 

+ a 3 5 2 ci[l - — — — — Z 3 sin(0 rm - ^0»)]. 

3 = 1 

Note that Frenet transformation is valid only locally, i.e. Z m (0) < r m i n , therefore nom- 
inators of all fractions are well defined. The nonholonomic planar 3-pendulum cannot 
achieve angles equal to 0i, 82 = 0, ±tt. Moreover, singularities in K\ matrix appear for 
sin(0 rm - Oi) = sin(0 rm - 6 1 - 6 2 ) = sin(0 rm - X - 2 - 3 ) = 0. 

For the regular matrix i\^, the following control signals guaranteeing a convergence 
of tracking errors to zero for pure kinematic constraints can be proposed 

u r = -K-\q r ,U)AU, A = A T > 0. (21) 

It is easy to observe that the system d20l) with closed-loop of the feedback signal (f2TT> is 
asymptotically stable and has a form 

im + A£ m = 0. 
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4.2 Path Following without Desired Orientation 

The manipulator with gears designed by Nakamura, Chung and S0rdalen has two con- 
trol inputs. It means that only two parameters can be regulated during the path following 
process. If we mean that the orientation of the end-effector of such manipulator is not 
very important, it is possible to control other Frenet parameters, e.g. l m - distance error 
from the desired path and curvilinear length s of the path. 
In such a case the following differential equations 

n i i -j n i i 

z=l j=l k=l \ ) m i=1 . =1 k=1 

have to be considered. Similarly to (fT8l) and dT9b . using the kinematics of the non- 
holonomic manipulator ©-(E]), these equations can be expressed in the matrix form as 
follows 



H(q r ^m)\ \ \ = H(q r ^ m )G2{q r )u = K a (q r ,£ m )u. (22) 



Matrix K s (q rj ^ rn ) fulfills the regularity condition (i.e. it is invertible) if some configu- 
rations, which imply the matrix singularity, are excluded from a set of possibly achieved 
poses of the nonholonomic manipulator. 

For nonholonomic 3 -pendulum matrix K s (q r ,£ m ) has the form 



Ks(qr,£m) = 


K s n K s i2 
K S 21 K S 22_ 


•> 


with elements defined below 


3 i 

K sll = ^k cos(6 rm - y^flj), 


3 i 3 
K s i 2 = a 2 Si ^ l i c °s(Qrm -^2®i)+ CL3S2C1I3 COs((9 rm - ^ #;), 


i=2 j=l j=l 


/ x 3 i 

K s2 i= ( , ^ U sin(fl rm y^Qj), 

1 c{s)l m . =i j=i 


a 2 s 1 c(s) ^ . ^ 

Ks22= l-c(,)/ m ^^ m( ^™ X 

i—2 j — 


»+r; 


l 3 sm(6 rm 

{S)lm 



Singular configurations of nonholonomic 3 -pendulum for the matrix K s are equal to 
configurations sin(0i) = sin(0i —62)= sin(0i — 62 — #3) = 0. 
If the following control law is applied 

u r = -K~ 1 (q r ^ m )v 1 (23) 
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where v G R 2 is a new input to the system (I22L then we obtain the decoupled and 
linearized control system of the form 

YMs)- <«> 

Now it is possible to control each variable separately. For instance, if we want to move 
along the desired path, not only to converge to this path, it seems to be a good idea to 
get 

vi = —Al m , V2 = const, A > 0. 

Such control algorithm guarantees the motion along the geometrical curve with constant 
velocity and, simultaneously, the convergence of the distance tracking error l m to 0. 

5 Dynamic Control Algorithm 

Lets consider the model of a mobile manipulator ([5]) with nonholonomic constraints 
©. We assume that reference kinematic controls ( r (t) = (rjj (t) , uj (t)) solve the 
path following problem for both nonholonomic subsystems, where rj r (t) was computed 
due to Samson algorithm (fT4l) for the mobile platform and u r (t) for nonholonomic 
manipulator is given by (l2Tt or d23l) . 

Then we propose a dynamic control law 

r = (5*)- 1 JQ* (( r - Ke^j + C*( + #*} (25) 

with symbols defined as follows 



__ i e v \ _ I rj-r] r 



Cc > ' e u I \u-u r i ' 



K X I 2 
K 2 h 



and K\,K 2 > 0, which preserves asymptotic convergence for full kinematic and dy- 
namic coordinates of the (nh, nh) mobile manipulator to their desired values. 

6 Simulations 

As an object of a simulation study we have chosen a planar vertical 3 -pendulum with 
nonholonomic gears mounted on a unicycle. Parameter of the dynamic control algo- 
rithm d25t was equal to K = 200 • I4. The desired path for the manipulator (a circle ) 
was selected as 

ill 0) = 0.25 cos As + 1 [ra], 
n 2 (s) = -0.25 sin 4s + 0.6 [ra], 

and the desired path for the mobile platform was a straight line 

P(s) : x(s) = —s [ra], y(s) = — [ra]- 
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Fig. 5. Path tracking for the mobile platform: a) distance error L, b) orientation error 6, c) XY 
plot. 




Fig. 6. Path tracking for the 3 -pendulum: a) an error of x coordinate, b) an error of z coordinate, 
c) an error of orientation 0. 
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The initial posture of the platform was selected as (x, y, 0)(O) = (0, 2, 37r/4) and con- 
figuration of the manipulator was equal to (#i, #2, #3)(0) = (0, 0.6732, — 7r/3). 

Tracking of the desired path for the mobile platform has been presented in Fig. \5\ 
Parameters of the Samson algorithm were selected as v = 1, &2 = 0.1, ks = 1. Tracking 
of the desired path with prescribed orientation for the nonholonomic 3 -pendulum has 
been presented in Fig. [6] Tracking of the desired path without preserved orientation for 
the same manipulator has been presented in Fig. [71 





a) ° 1 2 3 4 ™ E 5 [S] 6 7 8 9 10 b) ° 1 2 3 4 ™ e 5 n 6 7 

Fig. 7. Path tracking for the 3-pendulum: a) XZ plot, b) a distance error l m . 

7 Conclusions 

In the paper the problem of defining the path for doubly nonholonomic mobile manipu- 
lators has been considered. We have proposed a new approach to the path as a geometric 
curve defined with the orientation or not. Path following problem with prescibed ori- 
entation is very important for mobile systems, especially for mobile manipulators - it 
results from the fact that it is impossible to realize a task, namely unload a payload if 
the platform has wrong orientation during the regulation process. 

In turn, for nonholonomic manipulator the desired path need not be defined with 
orientation. In such a case a new approach to the path following problem has been 
presented in the paper. A new control algorithm, guaranteeing not only asymptotic con- 
vergence to the desired path but simultaneously the motion along the path with nonzero 
velocity, has been introduced. It is possible to define another kinematic control algo- 
rithms with specific properties using Frenet parametrization. 
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Abstract. On-line determination of the basic frequency of an unknown periodic 
signal with an arbitrary waveform is crucial in imitating and performing rhythmic 
tasks with robots. We present a novel method to determine the basic frequency of 
a periodic signal on-line. The method is based on adaptive frequency oscillators 
in a feedback loop. While so far several adaptive frequency oscillators in a loop 
had to be used and the basic frequency determined using logical algorithms that 
choose from the determined frequency components, our method extracts the ba- 
sic frequency of the input signal without any additional logical operations. The 
proposed novel method uses a single oscillator combined with a whole Fourier se- 
ries representation in a feedback loop. Such formulation allows extracting the fre- 
quency and the phase of an unknown periodic signal in real-time and without any 
additional signal processing or preprocessing. The method also determines the 
Fourier series coefficients and can be used for dynamic Fourier series implemen- 
tation. The method can be used for the control of rhythmic robotic tasks, where 
successful performing of a task crucially depends on the extraction of the funda- 
mental frequency. We demonstrate the properties and usefulness of the method 
in simulation and on a highly nonlinear and dynamic task of playing the robotic 
yo-yo. 

Keywords: Adaptive frequency oscillators, Nonlinear dynamical systems, Fourier 
series, Imitation, Yo-yo. 



1 Introduction 

Rhythmic tasks, such as locomotion (7), drumming 0, handshaking [9| or playing 
with different toys, like the yo-yo [ 17] or the gyroscopic device called Powerball [6], 
require both accurate trajectory generation and frequency tuning. Controlling such tasks 
with robots is a difficult problem that requires complex sensory system and advanced 
knowledge of the task and the the actuated device 1 1 1 ]. 

The task itself can often be separated in determining the frequency of movement on 
one hand, and determining the waveform of the movement on the other. Determining the 
frequency of a task is a complex problem, even more so when the measured signal has 
several frequency components as is the case in imitating human movement. Extracting 
the fundamental frequency of a task can be achieved in different ways, e.g. with signal 
processing methods, such as FFT, or for example with the use of nonlinear oscillators 
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ifTUl . Furthermore, to perform the task with a robot, one has to constantly generate 
appropriate trajectories and modulate them on-line. A possible approach to trajectory 
generation and modulation is movement imitation lfT4l . which can be preformed in sev- 
eral ways and with different trajectory encoding methods, e.g. splines [16] or dynamic 
movement primitives (DMP) |[T5l . 

One of the approaches that successfully combines frequency extraction and trajec- 
tory generation is the use of a two-layered imitation system based on nonlinear dynam- 
ical systems [5]. In their work the authors explain that the imitation system can be used 
for extracting the frequency of the input signal, learning its waveform, and imitating 
the waveform at the extracted or any other frequency. Similar, but with less favorable 
properties of robustness against perturbations, trajectory generation and modulation can 
be achieved by using only the first layer of such system for both frequency extraction 
and waveform learning [ 13 1. Both systems, the one-layered and the two-layered are 
based on adaptive frequency oscillators in a feedback loop. Using adaptive frequency 
oscillators in a feedback loop can determine several frequency components of an input 
signal. Despite favorable properties of such systems there is a considerable drawback 
in determining the basic or fundamental frequency of the input signal. 

When dealing with complex periodic and pseudo-periodic signals with several fre- 
quency components, such as measurements of human movement, the first layer of the 
imitation system, referred to as the canonical dynamical system, has to include a high 
number of oscillators in the feedback loop. Using this system for movement imitation 
requires determining the basic of fundamental frequency. This is accomplished by a 
logical algorithm that follows the feedback loop [5 1. With a high number of oscillators, 
and when several of the oscillators tune to the same frequency, the decision making is 
not straight- forward anymore and becomes complex. As a high number of oscillators is 
practically necessary, determining the fundamental frequency from a list of determined 
frequencies cannot be avoided. 

In this paper we propose a novel design of the first layer (the canonical dynamical 
system) for the two-layered imitation system. Contrary to the original approach [21 the 
proposed approach does not require a logic algorithm to determine the fundamental fre- 
quency of the input signal. We use a single adaptive phase oscillator in a feedback loop, 
which is followed by a complete Fourier series approximation. An algorithm to deter- 
mine the Fourier coefficients is built into the series approximation. The combination of 
an adaptive phase oscillator and the adaptive Fourier series allows us to easily extract 
the fundamental frequency of the input signal and use it to control rhythmic robotic 
tasks. Our novel approach essentially implements a real-time adaptive Fourier series 
analysis. It is able to calculate the Fourier coefficients of an unknown periodic signal in 
real-time and is computationally inexpensive. 

The usefulness of this system is presented on the case of playing the yo-yo. Control- 
ling the yo-yo with a robot is a difficult task and has already been a subject of several 
studies 11718141 which mostly rely on complex, specially designed controllers based on 
the models of the device. The task of playing yo-yo is highly non-linear and requires 
on-line frequency adaptation. The proposed approach simplifies the synchronization 
between the necessary upward jerk of the robot and the movement of the yo-yo by de- 
termining the frequency of the up-down motion from a measurable periodic quantity, 
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Fig. 1. Two-layered structure of the imitation system. The input yi n is a measured quantity and 
the output is the desired trajectory x of the robot. The input fa is the additional phase lag and 
r is the amplitude of the trajectory. The system can work in parallel for an arbitrary number of 
dimensions. 

which is either the measured force of the impact, or the length of the unwound yo-yo 
string. 

In the rest of the paper we give in section 2 a brief description of the original two- 
layered imitation system with an emphasis on the first layer - the canonical dynamical 
system. In section 3 we describe the novel approach using the Fourier series in the 
feedback loop. In section 4 we evaluate the proposed approach in both simulation and 
on a real- world experiment of playing the yo-yo. Conclusions and summary are given 
in section 5. 



2 Two-Layered Imitation System 

The two-layered imitation system was presented in detail in Q. In their work the au- 
thors explained that the system can be used for extracting the frequency spectrum of the 
input signal, learning the waveform of one period, and imitating the desired waveform 
at an arbitrary frequency. The system structure is presented in Fig. Q] The first layer, 
i.e. the canonical dynamical system, is used for frequency extraction. It is based on a 
set of adaptive frequency oscillators in a feedback loop. The second layer is called the 
output dynamical system and is used for learning and repeating the desired waveform. 
The latter is based on dynamic movement primitives - DMPs, e.g. fi31 . 

The first layer of the system has two major tasks. It has to extract the fundamental 
frequency Q of the input signal and it has to exhibit stable limit cycle behavior in order 
to provide the phase signal <£. The basis of the canonical dynamical system is a set 
of adaptive phase oscillator with applied learning rule as introduced in (H. In order 
to accurately determine the frequency it is combined with a feedback structure (2), as 
is presented in Fig. [2] The feedback structure of M adaptive frequency oscillators is 
governed by 



fa = uji — K • e • sin fa , 
Cji = — K • e • sin^, 

e = y in - y, 

M 

V = ^2 aj cos fa, 

i=l 

OLi = T] ■ e • cos fa , 



(1) 

(2) 
(3) 

(4) 
(5) 
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Fig. 2. Feedback structure of M nonlinear adaptive frequency oscillators. Note the logic algorithm 
that follows the feedback loop. 

where K is the coupling strength, <j>i,i = 1...M is the phase of separate oscillators, yi n 
is the input signal, M is the number of oscillators, oti is the amplitude associated with 
the z-th oscillator, and r] is the learning constant. 

As shown in Fig. [2 each of the oscillators in the feedback structure receives the same 
input, i.e. the difference between the input signal and the weighted sum of separate 
frequency components. Such a feedback structure preforms a kind of Fourier analysis. 
The number of extracted frequencies depends on how many oscillators are used. As only 
the fundamental frequency is of interest, the feedback structure is followed by a logic 
algorithm. Determining the correct frequency and the phase is crucial, because they are 
the basis for the output dynamic system and the desired behavior of the actuated device. 

When choosing the fundamental frequency one possible approach is to choose the 
first non-zero frequency as was presented in [5 1. However, it has a drawback that when 
more than one oscillators converge to, or oscillate, around the same frequency, the logic 
algorithm switches between the oscillators, and consequently the phase will not be 
smooth, leading to oscillations in the output trajectory. 

3 Canonical Dynamical System Based on Fourier Series 

In this section a novel architecture for canonical dynamical system is presented. As the 
basis of the canonical dynamical system we use a single phase oscillator with applied 
learning rule |T|. This is combined with a feed-back structure based on an adaptive 
Fourier series in order to accurately determine the frequency. A feedback structure with 
an adaptive frequency oscillator combined with an adaptive Fourier series is shown in 
Fig. [3] The feedback structure of an adaptive frequency phase oscillator is governed by 



4> = Q — K • e • sin c 
Q = —K • e • sin(/>, 
e = Vin- y, 



(6) 

(7) 
(8) 
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Fig. 3. Feedback structure of nonlinear adaptive frequency oscillator combined with dynamic 
Fourier series. Note that no logic algorithm is needed. 

where K is the coupling strength, <\> is the phase of the oscillator, e is the input into the 
oscillator and jji n is the input signal. If we compare Eqs. ([HO and Eqs. ([61Q, we can 
see that the frequency Q and the phase </> are now clearly defined. The feedback loop y 
is now represented by the Fourier series 



y = a + y^(a^ cos(i0) + A sin(i0)), 



(9) 



and not by the sum of separate frequency components as in Eq. [4] M is the size of the 
Fourier series and ao is the amplitude associated with the first segment of the series. It 
is updated by 

d = V • e, (10) 

where r] is a learning constant. The amplitudes associated with the other terms of the 
Fourier series are determined by 



di = r]cos(i(j)) • e, 
A = r?sin(i(/>) • e, 



(11) 
(12) 



where i = 1...M. As shown in Fig. [51 the oscillator of the feedback structure receives 
the difference between the input signal and the Fourier series. Since a negative feedback 
loop is used, the difference approaches zero when the Fourier series representation ap- 
proaches the input signal. Such a feedback structure preforms an adaptive Fourier anal- 
ysis, where the phase difference between the harmonics can only be 0, 7r/2, it or 3tt/2. 
This is not the case in the original approach lfT2l . where the phase difference can be 
completely arbitrary. 

The proposed approach has the ability to adapt to the basic frequency of the input 
signal. The number of harmonic frequency components it can accurately extract de- 
pends on how many terms of the Fourier series are used. Since in this structure only 
one oscillator is used and the harmonics are encoded in the Fourier series, the basic 
frequency and phase are clearly defined. This is an important improvement, especially 
for the usefulness of the imitation system when performing rhythmic tasks. 

The new architecture of the canonical dynamic system can be used as an imitation 
system by itself, as it is able to learn arbitrary periodic signals. After convergence, e 
reaches zero (with an accuracy that depends on the number of elements of the Fourier 
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series). Once e is zero, the periodic signal stays encoded in the Fourier series. The 
learning process is embedded and is done in real-time. There is no need for any external 
optimization process or learning algorithm. 

Adding the output dynamical system enables us to synchronise the motion of the 
robot to a measurable periodic quantity of the task we would like to preform. The mea- 
sured signal is now encoded into the Fourier series and the desired robot trajectory is 
encoded in the output dynamic system. Sice adaptation of the frequency and the learn- 
ing of the desired trajectory can be done simultaneously, all of the system time-delays 
can be automatically included. Furthermore, when a predefined motion pattern for the 
trajectory is used, the system time-delays can be adjusted with a phase lag parameter 
4>i . This enables us to either predefine the desired motion or to teach the robot how to 
preform the desired rhythmic task online. 

The output dynamical system also ensures greater robustness against perturbations 
and smooth modulation. Specially greater robustness to perturbation is crucial when 
performing fast, dynamic tasks. 

4 Evaluation 

In the following section we evaluate the proposed imitation system. In Section I47TI the 
numerical results from the original and the novel architectures are presented. In Section 
14. 21 a real- world experiment of playing the yo-yo with the use of the proposed imitation 
system is shown. 

4.1 Simulation 

In this numerical experiment the proposed architecture for the canonical dynamical sys- 
tem learns an arbitrary signal. The populating of the frequency spectrum is done without 
any signal processing, as the whole process of frequency extraction and adaptation of 
the waveform is completely embedded in the dynamics of the adaptive frequency os- 
cillator combined with the adaptive Fourier series. Unless stated otherwise, we use the 
following parameters: \i = 2, K = 20, M = 10. 

Frequency adaptation results from time- and shape- varying signals are illustrated 
in Fig. ID The input signal itself is of three parts: a periodic pulse signal, a sinusoid, 
and a sawtooth wave signal. Transitions between the signal parts are instant for both 
frequency and waveform. We can see that after the change of the input signal, the output 
frequency stabilises very quickly. 

A single adaptive frequency oscillator in a feedback loop is enough, because the 
harmonics of the input signal are encoded with the Fourier series in the feedback loop. 
As can be seen from the bottom plots in Fig.[4l the input signal and the feedback signal 
are very well matched. The approximation error depends only on M. 

A comparison with the original approach as proposed in Q is given in Fig. In 
their approach, if there are not enough oscillators to encode the input signal, the system 
will only learn the frequency components with more power. Thus, the output signal will 
only be an approximation. However, if there are more oscillators than the frequency 
components to learn, either some of them will not converge to any frequency or the same 
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Fig. 4. Typical convergence of an adaptive frequency oscillator combined with an adaptive Fourier 
series, driven by a periodic signal with different waveforms and frequencies. Frequency adapta- 
tion is presented in the top plot and the comparison between the input signal y and the approxi- 
mation y in the bottom plot. 

frequency components will be coded by several oscillators, as shown in the top plot in 
Fig.[5j where a pool often oscillators was used. In this particular experiment, five of the 
oscillators converge to the basic frequency of the signal. Choosing the right oscillator 
from that pool is a very difficult task and requires a complex logic algorithm. On the 
other hand, using our new approach, where the feedback is encoded with a Fourier 
series, the oscillator converges to the basic frequency of the input signal. Therefore, the 
basic frequency and the phase are clearly defined. Furthermore, the approximation and 
the convergence of the feedback signal is quicker, as it is shown in the bottom plots in 
Fig. [5] Even after 350 s, the original architecture from [12] did not produce as good an 
approximation as it was after 20 s when using our new proposed canonical dynamical 
system. 



4.2 Application to Robotic Yo-Yo 

To illustrate the usefulness of the proposed approach we implemented it on a real robot 
playing a yo-yo. Playing yo-yo with a robot can be achieved in different ways, depend- 
ing on what one can measure. It can be the length of the unwound string, which can 
be effectively measured by a vision system. As described in [17], using vision is also 
one of the ways humans do it, even though approaches using only the measured force 
were also described [8]. With our proposed system, playing yo-yo can be accomplished 
either with force feedback or with visual feedback. Furthermore, the proposed system 
is able to synchronize even if the input signal is changed from one measurable quantity 
to another during the experiment. 

We preformed the experiment on a Mitsubishi PA- 10 robot as presented in Fig. [6] 
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Fig. 5. Comparison between the pool of the adaptive oscillators and our proposed approach. The 
first plot shows evolution of frequency distribution using a pool of 10 oscillators. The middle 
plot shows the extracted frequency using an adaptive frequency oscillator combined with Fourier 
series. The comparison of approximated signals is presented in the bottom plot. The thin solid 
line presents the input signal, the solid line presents our new proposed approach and the dotted 
line presents the pool of adaptive oscillators. 
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Fig. 6. Experimental setup. 



A JR3 force sensor was attached to the end effector to measure the impact force of the 
yo-yo hitting the end of the string, and a USB camera was used to measure the length 
of the unwound string. 

The two layered imitation system with the novel canonical dynamical system was im- 
plemented in Matlab/Simulik. The control scheme is presented in Fig. [7] As we can see, 
the imitation system, based on a nonlinear oscillator combined with dynamic Fourier 
series, provides the desired trajectory for the robot with a yo-yo attached at the top. The 
motion of the robot was constrained to up-down motion using inverse kinematics. The 
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Fig. 7. Proposed two-layered structure of the control system for controlling the peak height of the 
yo-yo. The input is either the force yf or the visual feedback y v . 
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Fig. 8. Pre-defined hand motion pattern for playing yo-yo (solid) and encoded into the output 
dynamic system (dashed). 



length of the string or the force from the top of the robot can be used as the input into 
the system. Since a measurable force difference appears only as a spike when the yo-yo 
hits the end of the string, we modify the signal in a way that it carries more energy. In 
our particular case we use the measured spike to create a short pulse. 

To preform the task, we first determined the waveform of the required motion pattern. 
We chose the motion pattern described in ifTTlL which satisfies the required criteria for 
playing the yo-yo. The robot motion pattern encoded into the output dynamic system 
(dashed line), and the pre-defined motion pattern (solid line) are presented in Fig [8] 

The frequency of the task depends on the parameters of the yo-yo itself, and on how 
high the yo-yo rolls up along the string. The height can be influenced by the amplitude 
of the robot (or hand) motion, which can be easily modified using the amplitude param- 
eter r of the motion, see Fig. [7] A PI controller was used to control the peak height of 
the yo-yo. The controller is given by 
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r(t) = k p h(t) + h h(t)dt, 



(13) 



where k p = 2, and ki = 0.4 were determined empirically. Fig. [9] shows the results of 
frequency adaptation and yo-yo height during the experiment. 

As we can see, the frequency of the imitated motion quickly adapted to the motion 
of the yo-yo and stable motion was achieved. At approximately 52 s the input into the 
imitation system was switched from force feedback to visual feedback. At that point 
some oscillation in the frequency and the approximation of the input signal can be 
observed because they have to adapt to the new waveform of the input signal. Further- 
more, from the middle sequence in Fig.[l0|we can see that the amplitude of the robot 
motion is higher after switching from the force feedback to the vision feedback. Despite 
the change, the imitation system still manages to extract the correct frequency and the 
robot motion returns to steady-state oscillations. Note that in the bottom sequence in 
Fig.[l0]the amplitude of robot motion is smaller than immediately after the switch. 

As far as we know, this is the first system which has the capability of playing the yo- 
yo by force feedback or by vision feedback, without changing the system parameters. 
Furthermore, switching from one to another measured quantity can even be done during 
the experiment. This shows that the proposed system is adaptable and robust. 




40 50 60 

Time [s] 



100 



Fig. 9. Robot trajectory x in the top plot, height h of the yo-yo in the second plot, extracted 
frequency in the third plot and signal adaptation y in the bottom plot. At 52 s the input signal is 
switched from force feedback to visual feedback. Yo-yo parameters in this case are: axle radius 
r a — 0.01 m and mass m = 0.2564 kg. 
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Fig. 10. Image sequence of a robotic yo-yo. The top sequence presents the descent of the yo-yo 
from the gripper. In the middle sequence the behaviour of the system after switching from force 
feedback to vision feedback is shown (£1 = 54 s) and in the bottom sequence the behaviour in 
steady state is presented (£2 = 81 s). 

5 Conclusions 

We presented a new architecture for the canonical dynamical system which is a part of 
a two layered imitation system, but can be used as an imitation system by itself. The 
dynamical system used to extract the frequency is composed of an adaptive phase os- 
cillator combined with a Fourier series. This system essentially implements an adaptive 
Fourier series of the input signal. It can extract the frequency, the phase and the Fourier 
series coefficients of an unknown periodic signal. This is done in real-time and without 
any additional processing of the input signal in the sense of determining the frequency 
or setting processing parameters. Integrating this system into the imitation system based 
on dynamic motion primitives enables simple and computationally inexpensive control 
of rhythmic tasks with at least one measurable periodic quantity. 

We presented the use of the proposed imitation system to preform a rhythmic robotic 
task that requires synchronization with the controlled device. For playing the yo-yo we 
have shown that the information on how high the yo-yo rolls up along the string or sim- 
ply the force feedback is enough to achieve stable performance. The proposed approach 
enables playing yo-yo by measuring either the force or the yo-yo position. Furthermore 
we also showed that the system has the capability of changing the measured quantity 
in a single experiment without loosing the synchronization between the robot and the 
yo-yo. 
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Abstract. It is well known that for bilateral teleoperation, force feedback in- 
formation is needed. In this paper, we propose a control approach for bilateral 
teleoperation with uncertainties in the model of the slave robot and which does 
not use force sensors for haptic feedback. The controller design is based on a 
cyclic switching algorithm. In the first phase of the cyclic algorithm, we estimate 
the environmental force acting on the slave robot and in the second phase a track- 
ing controller ensures that the position of the slave robot is tracking the position 
of the master robot. A stability analysis of the overall closed-loop system is pre- 
sented and the approach is illustrated by means of an example. 

Keywords: Bilateral teleoperation, Force sensor-less robotic setups, Haptic 
feedback. 



1 Introduction 

In this paper, we consider the problem of bilateral teleoperation in force-sensor-less 
robotic setups. It is well-known that haptic robotic devices and teleoperation systems 
exploit information regarding the external forces (see U and O), e.g. for haptic feed- 
back. The slave robot interacts with the environment and its dynamics are dependent on 
external forces induced by this interaction. These forces can be contact forces (interac- 
tion forces between environmental objects and the robot) or exogenous forces induced 
by the environment. 

In bilateral teleoperation, knowledge on the unknown environmental force applied on 
the slave robot is typically needed to achieve coordinated teleoperation. One option for 
obtaining such disturbance information is to equip the slave robot with force- sensors; 
for examples of such robotic devices, especially haptic devices, which use force sensors 
the reader is referred to Q, 0. However, in many cases, the most important external 
forces for multi-link robots appear at the end-effector. Note that force sensing at the 
end effector of the robot is often not feasible since the external forces will typically 
interact with the load, which the slave robot is positioning, and not directly with the 
robot end-effector. Besides, in some cases, the position at which the external forces are 
applied is a priori unknown and may be on a robot link as opposed to on the end-effector. 
Moreover, the usage of force-sensors can be expensive and increase the production costs 
of the robot which can be undesirable especially in domestic applications. 

J.A. Cetto et al. (Eds.): Informatics in Control, Automation and Robotics, LNEE 89, pp. 197 J210. | 
springerlink.com © Springer- Verlag Berlin Heidelberg 2011 
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For these reasons, the usage of a disturbance estimation scheme for force- sensor- 
less robots for the purpose of haptic feedback can be interesting. Disturbance observers 
(DOB) have been widely used in different motion control applications (El, BlL 161) for 
determining disturbance forces, such as friction forces. However, the performance en- 
hancement of these DOB strategies may lead to smaller stability margins for the motion 
control (| 7 1); therefore, a robust design with respect to the environmental disturbances 
and model uncertainties is needed. Previous results on robustly stable DOB ([8], [9], 
|[TQlL |[TT1 ) are based on linear robust control techniques. Some nonlinear DOB have 
been developed for the estimation of harmonic disturbance signals (021, H3). 

Various strategies have also been considered for force-sensor-less control schemes 
estimating the external force. [14] proposes an adaptive disturbance observer scheme, 
and |[T5l and lfT6l propose an H°° estimation algorithm. In ifTTlL a control strategy 
called "force observer" is introduced. This design uses an observer-type algorithm for 
the estimation of the exogenous force. The drawback of this approach is that it assumes 
perfect knowledge of the model of the system. 

In parallel with force estimation strategies, based on disturbance observers, another 
approach using sensor fusion has been developed to diminish the noise levels of the 
force sensors. In [18], force and acceleration sensors are used, while in [19], data from 
force sensors and position encoders are fused. Sensor fusion provides better qualitative 
results than obtained by employing more expensive force sensors. 

Here, we present a control approach for bilateral teleoperation with an estimation 
strategy for external forces acting on the slave robot with a load with unknown mass. 
The difficulty in this solving this problem is that due to the uncertainties in the mass of 
the load it is possible for the previously referenced algorithms to estimate the exogenous 
force acting upon the slave robot. The method introduced in this chapter extends a result 
presented in [20], which considered the human-robotic co-manipulation problem. The 
proposed algorithm is robust for large uncertainties in the mass of the load. 

The chapter is structured as follows. Section[2]presents the problem formulation and 
in Section[5]we describe the control strategy we propose. In SectionlU we apply the al- 
gorithm to a master-slave robotic setup. In the final section of the paper, the conclusions 
and some perspectives on future work are discussed. 

2 Problem Statement 

The problem that is tackled in this paper is that of bilateral teleoperation in force sensor- 
less robotic setups. We assume that the slave robot is generally carrying a load (e.g. tool 
or product) and that the exogenous forces act on the slave or on the load. We consider 
the case in which no force sensor is present to measure the exogenous force directly. 
Moreover, we consider the realistic case in which the mass of the load is not known 
exactly which further challenges the estimation of the exogenous force. In order to 
solve this problem, we propose the design of a force estimator which is robust to the 
uncertainties in the mass of the load. In order to achieve teleoperation, the position of 
the slave robot must track the position of the master robot. 

In FigureQ] the block diagram of the teleoperation setup is presented with the blocks 
Master and Slave representing the dynamics of the master and the slave robot, respec- 
tively, and the block C representing the control algorithm for bilateral teleoperation. 
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Fig. 1. Problem Setup. 

The signals Fh and Fe represent the human and the environmental force, respectively; 
xm and xs are the positions of the master and the slave robots and Fm and Fs are the 
control signals for the master and slave robots, respectively. We adopt the assumption 
that the only measurements available are the position of the end-effectors and hence we 
aim to construct an output- feedback control strategy. 

Consider the dynamics of the master and slaves robots dynamics: 

f m M x M = F H (t) +F M (t) m 

\msxs = F E (t)+F s (t) ' Uj 

where %, xs G M 3 are the position of the master and the slave robots, respectively, 
Fh ,Fe € M 3 are the human and the environmental force, respectively, tum is the 
unknown inertia of the master robot (including some unknown inertia of the human 
hand) and ms is the unknown inertia of the slave with the load (the masses are assumed 
to be bounded by m M , rn s 6 [M min , M max \). 

The objective of this work is to design the controller C such that the following goals 
are met: 

- the position of the slave robot is tracking the position of the master robot; 

- an accurate estimate of the environmental force is transmitted to the master robot; 

- the overall system is stable. 

3 Control Design 

Due to the uncertainties in the model of the slave robot we can not estimate the un- 
known environmental force and track the master robot position at the same time (the 
unknown inertia of the load and the fact that only position measurement data is avail- 
able do not allow for simultaneous force estimation and position tracking). Therefore, 
we are proposing a switching controller based on a cyclic algorithm. During one cycle 
of duration T, we will have two phases as in Figure [2} 

1 . estimation of the environmental force; 

2. position tracking. 

During the first phase, which last for a period of To (To < T), the controller will 
behave as a force estimator. Here we are using the force observer introduced in l20l 
to estimate the external force which will be used for the purpose of haptic feedback 
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Fig. 2. Temporal division of the control strategy. 
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Fig. 3. Controller Design. 

and during the second phase we are keeping the estimated force constant, Fm (t) = 
F E (t) = Kqx s 2snAF s {t) = -K x s , for kT < t < kT + T , k G N, where parameter 
Ko is a scalar parameter in the force estimation algorithm and is chosen such that the 
settling time of the estimation of the force is smaller than To. In the second phase, 
we are using a PD controller for the slave robot to track the position of the master 
robot, F M (t) = F E (kT + T ) and F s (t) = K x {x M ~ x s ) + K 2 (x M - x s ), for 
kT + To <t< (k + 1)T, k 6 N, where scalars K\ and K 2 represent parameters of the 
PD controller that ensures the tracking of the master robot position by the slave robot 
(these parameters are chosen such that the polynomial mss 2 + K 2 s + K\ is Hurwitz 
\lms G [M m i n , M max ] ) and Fe {KT + To) is the estimation of the environmental force 
at the end of the first phase. In Figure we present the block diagram representation 
of the controller where the controller blocks are represented by their transfer functions 
in the Laplace domain (s G C) and the block called Memory saves the last estimate 
of the environmental force at the end of the first phase and provides the same constant 
output during the entire second phase.This means that (estimation phase) and (tracking 
phase). The switches in Figure [3] are set on positions corresponding to the first phase of 
the algorithm. 

In the following section, we study the stability for the closed-loop system (including 
force estimation error dynamics and tracking error dynamics). 



3.1 Description of Model Dynamics 

For the purpose of stability analysis, we first formulate the model of the error dynamics. 
In order to obtain the error dynamics, the dynamics of the master and slaves robots are 
needed in both phases. During the first phase (kT < t < kT + To, k G N), the model 
dynamics are: 
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( m M x M = F H (t) + Kqx s (2 , 

In the second phase of the algorithm (kT + T < t < (k + 1)T, k e N), the system 
behavior is described by: 

f m M x M = F H (t) + F E (kT + T ) (3) 

\ m s x s = F E (t) + Ki(x M ~ x s ) + K 2 (x M - x s ) 

In the sequel, we assume that the exogenous forces acting on the system (human force 
Fn(t) and environmental force F E (t)) and their derivatives are bounded. 

Given the fact that the same world frame is considered for both master and slave 
robots, the 3D system described by relations and d3} can be seen as a set of three 
decoupled 1-DOF systems along each space dimension. Since the behavior for the three 
1-dof system is similar, for simplicity, we are going to consider in the sequel only 1- 
DOF system, knowing that the extension for 3D systems is trivial. 

3.2 Stability Analysis 

Let us define the vector of tracking error coordinates e = [e Xl e x ) T = [%(t) — 
xs(t)jXM(t) — xs(t)] T , which contains both the position and the velocity tracking 
errors, and the force estimation error e E = F E — F E . Then the force estimation error 
dynamics are described by: 

e F = -e F -F E , (4) 

m s 

during the first step of the algorithm (kT < t < kT + T , k G N) and 

e F = -F E , (5) 

during the second phase (kT + T < t < (k + 1)T, k G N). 
The tracking error dynamics is represented by: 

01\ / \ fF H \ ( 



k = n n e + -JL_ _JL_ 77 " + _JL_ i JL ) e F, (6) 

\ uu / \m M m M / V E / \m M m s 

for t e [kT, kT + T ), with k 6 N and 

1 \ / \ /F^\ / 



Ei. _^2. 1 e + ( _J l 3- J ( F ) + I -±- I eF ' ^ 7) 

m s m s / V m M rn M rn s / \ E / V rriM / 

for t e [kT + T , (fc + 1)T), with fc 6 N. 

The goal of this section is to prove that the composed estimation and tracking error 
dynamics ©-(IT]) are input- to- state stable with respect to the inputs Fh , F E and F E . For 
this we are going to use a result introduced in |2H that states that the series connection 
of two input- to- state stable systems is also an input- to- state stable system. In the sequel, 
this proof will be split into two parts: 

- first, we prove that the force error dynamics ©, $5§ are input- to- state stable with 
respect to the input F E ; 

- secondly, we prove that the tracking error dynamics ©, © are input- to- state stable 
with respect to the inputs Fh, F e and e E . 



202 S. Lichiardopol, N. van de Wouw, and H. Nijmeijer 

Input-to-State Stability of the Force Estimation Error Dynamics. The stability 
analysis of the force estimation error dynamics is done by studying an exact discreti- 
sation of the system ©, $5§ at the sampling instances kT. Let us now construct such a 
discrete-time model. 

The solution of system © at time t = kT + T , with k G N, is: 

e F (kT + T ) = e~^ T °e F (kT) + + £° e~^ {T °~ T) F E (kT + r)dr. (8) 
The solution of system ([5]) at time t = (k + 1)T, with fc G N, is: 

e F ((fc + 1)T) = e F (&T + T ) - J T " T ° F^fcT + T + r)dr. (9) 

Define the sampled force estimation error by e& := e F (kT), with fc G N. Combining 
relations ® and ([9]), one can obtain the discrete-time force estimation error dynamics: 

e/c+i = e~^ To e k + iu fc , (10) 

with^ fc = J To e~^ iTo ~ r) F E (kT + T)dr- J^~ T ° i^; (/cT + T + r)dr. The system 

< 1, since 



_iio_ To 



dTQb is input- to- state stable with respect to the input Wk because 

the parameters Kq, Tq and the inertia ms are positive. Note that Wk is bounded for any 

bounded F F (t) and bounded To. 

Now we exploit a result in [22] that says that if the discrete-time dynamics is ISS and 
the intersample behavior is uniformly globally bounded over T, then the corresponding 
continuous-time dynamics is ISS. The fact that the intersample behavior is uniformly 
globally bounded over T directly follows from ©, © with F F bounded, since 

e(t) = { e-^- kT) e F {kT) + f hT e-^-r)p E{r)dT , kT < t < kT + T 
Cv; \e F (kT + T )-fl T+To F E {r)dT, kT + T <t < (k + 1)T 

(11) 

Input-to-State Stability of the Tracking Error Dynamics. Similarly to the study of 
the force estimation error dynamics, we evaluate the input- to- state stability property of 
the tracking error dynamics with respect to the inputs Fh, F f and e F . 
The solution of system © at time t = kT -f To, with k G N, is: 

e(kT + T ) = e A ^e(kT) + J^° e A ^ T °-^ B n u(kT + r)dr 

+ / To e^( T o- T )Bi 2 e F (fer + r)dr, ( j 

where A x = ( j, Bn = f _j i_ J, S i2 = ( _j_ j_ ) and u(t) = 

V / V m M rn M / V rn M rns J 

[F H (t)\ 
{F E (t)J- 
The solution of system © at time t = (k + 1)T, with fceN, is: 

e((k + 1)T) = e A ^ T - T ^e(kT + T ) + / r_r ° e A ^ T - T °- T ^B 21 u{kT + T + T )dr 

+ !o~ T ° e A ^ T - T °-^B 22 e F (kT + T + r)dr, 

(13) 
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where A 2 = ( Kl _ k 2 J , B 21 = ( _j i i ) and B 22 = ( 1 



ras ras / \ m M rn M rns / \ rn M 

Let us define 

c^ := e A2 ( T - To H/o T0 e Al(To " T) 5ii^(/cT + r)dr + f^ e A ^ T °~^ B 12 e F (kT + r)dr) + 

JT-To e M(T-T -T) B2iU ( kT + Tq + r)dr + JT-To e A^T-T -r) B22eF ^ T + ^ + ^ 

(14) 
and e/e := e(fcT), with fc G N. Combining relations (1121) and dl3l> ^ we obtain the follow- 
ing discrete-time error dynamics: 

e fc+1 := e A ^ T - T °)e AlTo e fc + w fc , (15) 

where ujk is bounded for all fc, since T, To are bounded, T^, T# are bounded by as- 
sumption and ep is bounded due to the fact that the force estimation error dynamics is 
ISS with respect to F E . 

Next, we study the input- to- state stability property of the system (ITSb with respect 
to the input Uk. But before we carry out this step, we need to evaluate the matrix 
Q = e M(T -Tq) e A x T Q Namely, input- to- state stability of (fT5l) implies, firstly, the global 
uniform asymptotic stability of e = when the input Uk is zero and the boundness of 
the error e for bounded inputs uok- 

For the evaluation of the matrix Q, two exponential matrices must be determined; 
as the matrix AiTq depends only on known parameters, we can easily determine its 
exponential: 

._ ^A 1 T _ f 1 T 



E 1 :=e^^ = l Q ^\. (16) 

In order to compute the exponential of matrix P = A 2 (T — To), we are using a pro- 
cedure similar to the one introduced in [23 ], which employs the Cay ley-Hamilton theo- 
rem, which says that if p(A) = det(A/ n — A), with I n the n x n identity matrix, is the 
characteristic polynomial of a matrix A G R nxn then p(A) = 0. This means that given 
the matrix P, for any i > 2, there exists a set of coefficients a^bi el such that the i th 
power of P can be expressed in terms of its first two powers: 

P l = a z I n + biP. (17) 

Let us now exploit dTTb to determine the exponential of the matrix P: 



00 pi °° | 



e^ = y — = y -Mih + hP), (is) 



=0 i=0 



or 




(19) 
Using the expression of A 2 , we can decompose P as follows: P — U -\ — — T, where 

and „ „ 

-K 1 (T-To)-K 2 (T-T ))- (21) 
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Consequently, the expression for the exponential matrix becomes: 



Hm'Hm u+ Hm L - «• 



Let us now define the following scalars: 



a = mm 



m s e[M rnin ,M max ] \~* l\ 
\i=0 






a = max 



Ell (^) 



and 



m S £[Mm in ,M m ax] \~^ l ' 

00 h 

13= min [ V 4 ) , (25) 

/ 00 7 \ 

/? = max f V 4 • (26) 

m€[M min ,M max ] \ ^^ 2! / 
\i=0 / 

Given the fact that ms G [M min , M ma J, we can define the scalars 7 = -^ — and 

__ i 



Then there always exist ("1, C2, (3 G [0, 1] such that: 



E| =Cia+(l-Ci)5, (27) 



=o* ! 



OO , \ 

^^ =(2^+(l-(2)^ (28) 



,i=0 % 



and 

— =C37 + (1-C3)7- (29) 

Introducing relations d27k d28l) and d29l ) into expression (l22l) leads to: 



= (Ci« + (1 - Ci)«) ^2 + (C 2 ^ + (1 - C 2 )/3) tf 

+ (C37 + (1 - Cs)i) {(2P + (1 - c 2 )/3) l, 



(30) 



for some Ci,C2,C3 € [0,1]. _ 

Let us define the matrices A = 3aEi, T 2 = 3aE l9 T 3 = 3§UE l9 A = 3/3UE l9 
T 5 = 3§jLE l9 r 6 = 3pfLE l9 r 7 = 3~J3j_LE 1 and T 8 = 3~J3*fLE l9 and the scalars 

n - Cl no - Wl no - ^ 2 n„ - X ~& n - C2C3 . _ C2(l-Ca) . _ (I-C2K3 
£l — -3-^2 — —3— , £3 — -3-, QA — —3—' Q5 — —3—, £?6 — 3 , Ql — 3 » 

gs = ~ 3 ~ • This means that the expression of matrix Q is equivalent to: 

8 

Q = Y,Q* r u ( 31 ) 
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with Z)i=i Qi = 1 - 

Thus we have now found the generators for a convex (polytopic) matrix set that 

overapproximates the matrix set Q, with the uncertain parameter ms 6 [M min , M max ] . 
Notice that J2iLo ff an d Xl^o if are infinite sums and will in practice be approximated 
by finite sums of length N. Next, we provide an explicit upper bound on the 2-norm of 
the approximation error induced by such truncation. 

Theorem 1. Consider an integer TV £ N and a real positive scalar d such that 
- /jl— J^f^- < 1, where 



max \eiq(P T P)\ 

mse[M min ,M max } L J 



(32) 



- Mi > N, V^ < il 
Then: 

Proof. 

Eoo pi_ < ^c 
i=N i\ n — 2-^i 



00 pi 

2-^ 7T 



--N 



i=N 



P % 



< 



fl 



N 



<n 



\ P \\2 < V^OO y{\ma X y 



(33) 



(34) 



where the inequality ||^|| 2 < ||A|| 2 x . . . x ||A|| 2 = max(eig((A T A)) 1 has been used. 
Using the property that Ma e M + , 37V e N such that Mi > N, VaJ < il, inequality (l34t 
becomes: 



00 pi 

i=N 



<-±^ x 



>.\ 



< 



J2^- 



(35) 



2 z=iV i=N 

Let us now employ the known result of convergence of geometric series which states 

l-a n+1 



that Va e [0, 1), lim n _>oo Yl7=o a% = lim n^c 



E 

i=N 



< 



1-a 

i-m' 



1 

1-a' 



(36) 



a 



Using TheoremQ] we can choose N such that the approximation error is small (even as 
low as the machine accuracy). 

In the next theorem, we provide LMI-based stability conditions for the discrete-time 
tracking error dynamics (fT5l) to be ISS with respect to the input ujk- 

Theorem 2. Consider the discrete-time system fl5i . If there exists a matrix Q = Q T > 
and scalar $ E (0,1), such that the following linear matrix inequalities are satisfied: 



P^QPi -Q< -qQ, i E {1, . . . , 8} , 



(37) 



where P{ are defined above, then the system rf73T ) is ISS with respect to the input LUk- 
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Proof: Using the Schur complement, relations d37b can be written as: 

on F -o)^-^ ie ^--^- (38) 



Multiplying every inequality (1381) with Qi and summing them up, we obtain: 

which according to equation d3TT> is: 

-J?Q T J? 



or 

Q T f2Q -Q< -<;Q. (41) 

Let the candidate ISS-Lyapunov function be V k = (e/ c ) T i?e/ c . We compute AT4 = 

AV^ = (e fc ) T Q T ^Qe fe - (e fe ) T ^e fe + 2(e fe ) T Q T ^ fc + {u> k ) T (2u k , (42) 

which according to fiTT) gives: 

AFfc < s(e k ) T ne k + 2(e /e ) T Q T ^ /e + {u k ) T (2u k (43) 

After some straightforward computations, we can show that: 



\e 



2 >2 /^ supM ^ Ay <_||| 6 ||2 5 (44) 

where A ma;E and A m i n are the largest and the smallest eigenvalues of matrix i?, respec- 
tively. 

(l44t implies that system (TBI) is input- to- state stable with respect to the input u k ', see 
[24 ] for sufficient conditions for the ISS of discrete-time systems. 



Remark 1. For the sake of simplicity, Theorem \2\ is based on a common quadratic 
ISS Lyapunov function V = e T f2e. Alternatively, a parameter-dependent Lyapunov 
function approach could straight-forwardly be exploited to formulate less conservative 
stability conditions. 

The LMIs d37b are defined for the non-truncated i~i, but in practice we evaluate the 
vertex matrices using a truncation after TV iterations as provided by Theorem Q] The 
errors can be ensured to be as low as the machine accuracy by choosing N large enough, 
just as the errors obtained from the numerical solver of the LMIs. 

The last part of the study of the ISS property of the tracking error dynamics is to an- 
alyze the intersample behavior. Using Theorem^ we can prove that the error dynamics 
are ISS on the sampling instances t = kT, with k G N. Given the choice of the param- 
eters K\ and K2 such that the system © is Hurwitz for all ms E [M m i n , M max ] 9 we 
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can conclude that during the second phase it G [kT + To, (k + 1)T)) the evolution of 
the tracking error is bounded. In order to prove the stability of the overall continuous- 
time system, we need to show that the position error dynamics are also bounded for 
te (kT,hT + T ). 

The solution of system ©, for t G (kT, kT + To), is given by: 

lt\ /7 _ N r t (\ t- 



e(kT + t) 



1 



+ fo 



e(kT) 

lt-T 
1 



J o \ o 1 
B 12 e F (kT ■ 



Bnu(kT + r)dr 
r)dr, 



(45) 



with t G [0,To]. As the human force and the environmental force are bounded, we 
can define F = max tG ( fcT)fcT+To ) (\Fu(t)\ -f |Te(£)|). In the previous section, we 
have proven that the force estimation error dynamics are ISS and consequently ep(t) is 
bounded; therefore, there exists E F = Hiax te (/ C T,fcT+T ) (\ e F(t)\), with Ep bounded. 
Considering the three terms in the right-hand side of (05]), we can conclude that the first 
one is bounded due to the boundness of the discrete-time error dynamics and the fact 
that t G [0, Tq], the second term satisfies: 



lt- 
1 

and the third term satisfies: 



B n u(kT 



It 




1 



B 12 e F (kT + r)dr 




(46) 



(47) 



Therefore, we can conclude that the inter-sample evolution of tracking error is also 
bounded for t G (kT, kT + To). Once more, we can employ the result from [22] to 
prove that the continuous-time tracking error dynamics ©, (|7]) is ISS with respect to 
the inputs Fh, Fe and e F because the discrete-time tracking error dynamics is ISS and 
the intersample behavior is uniformly globally bounded over T. 

Since the force estimation error dynamics ©, ([5]) is ISS with respect to the input Fe 
and the tracking error dynamics ©, © is ISS with respect to the inputs Fh, Fe and ep, 
we use the result introduced by [21 ] concerning the series connection of ISS systems 
to conclude that the closed-loop system ©-(IT]), see also Figure[T]with the controller C 
with the block diagram representation from Figure [3] is ISS with respect to the inputs 
Fh, Fe andT^. 

Remark 2. By studying the ISS property of the estimation and tracking error dynam- 
ics, one can observe that the steady-state force estimation and tracking errors can be 
influenced by tuning the parameters T, To, K$, K\ and K 2 . The algorithm provides a 
deeper insight into these relations. If we consider the converging manifold that bounds 
the error signal we can determine these parameters in accordance with the desired 
convergence rate. 

Remark 3. In case the environmental force Fe is constant, i.e. Fe = 0, the force 
estimation dynamics are globally exponentially stable and the tracking error dynamics 
is ISS with respect to the inputs Fh and Fe. This means that "perfect" haptic feedback 
is provided and that bounded tracking errors remain; therefore the closed loop system 
is stable. 
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Remark 4. The exact "tracking" regulation of the slave robot with respect to what the 
human has in mind is up to the human (since the human is in charge of the ultimate 
positioning). 

4 An Illustrative Example 

In this section, we will apply the control design proposed in the previous section to 
a master-slave teleoperation setup consisting of two 1-DOF robots. The inertia of the 
robots is considered to be in the range trm, ms € [0.1, 10] kg. 

The "human" controller has been emulated by a linear transfer function: 



H(s) = 



K d {T d s + 1) 500(1 + s) 



T PL s + 1 



0.1s + 1 



(48) 



with saturation at ± 1007V. Here we use real human parameters, since the human move- 
ment is lower than 6Hz. Moreover, to comply with the human sensing range, which is 
between 0Hz and 40 — 400Hz depending on the amplitude of the input signal, we have 
chosen the parameters of the cycle period of the controller T = 0.01s and the duration 
of the first stage To = T/2 = 0.005s. The force estimator acting in the first phase of 
the algorithm is defined by parameter K$ = 10 5 . The tracking PD controller which is 
active during the second phase has the parameters K\ = 200 and K2 = 1. 

In Figure |U we present the results of the simulation of the master-slave position 
tracking when the "human" is performing a movement from 0m to 0.25m on the master 
robot (i.e. the "human" force Fh is computed such that this motion is achieved) and a 
sinusoidal external force Fe(£) = 0.5sin(27rt) is disturbing the slave robot. The dotted 
line is the position of the master and the solid line is the position of the slave. 

One can observe that because no disturbance rejection controller is implemented, the 
external force is preventing the position signal to settle exactly at 0.25m. In Figure[5j a 
zoomed in version of the Figure |4] that emphasizes this aspect is presented. 



£ 0.3 



15 20 

Time [s] 



25 30 



Fig. 4. Position tracking. 
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Fig. 5. Position tracking (zoomed version of Figure |4}. 

5 Conclusions and Perspectives 

In this paper, we have introduced a new control algorithm for bilateral teleoperation of 
3D robots in force-sensorless setups using a switching strategy between a force estimat- 
ing controller and a tracking controller. This switching algorithm guarantees both the 
estimation of the environmental force acting upon the slave robot (to be used in haptic 
feedback) in the absence of force sensors and the convergence of the tracking errors in 
the case of external perturbations acting on the slave robot. We note that the ultimate 
position setting is the responsibility of the human, as he is in charge of the position of 
the master robot. Finally, we remark that the proposed algorithm is robust for an un- 
known inertia of the load to be carried by the slave robot and an unknown inertia of the 
human hand on the master side. 

Future perspectives of this work will mainly focus on an extension to multi-degree- 
of- freedom robots with nonlinear dynamics. The reader should note that the extension 
of this result for robots with nonlinear dynamics using a pseudo-linearizing controller 
as in [20] is not straight forward, due to the variable change which depends on the 
velocities and accelerations. 
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Abstract. Grasping is one of the most important abilities needed for future ser- 
vice robots. Given the task of picking up an object from betweem clutter, tradi- 
tional robotics approaches would determine a suitable grasping point and then 
use a movement planner to reach the goal. The planner would require precise and 
accurate information about the environment and long computation times, both 
of which may not always be available. Therefore, methods for executing grasps 
are required, which perform well with information gathered from only standard 
stereo vision, and make only a few necessary assumptions about the task envi- 
ronment. We propose techniques that reactively modify the robot's learned motor 
primitives based on information derived from Early Cognitive Vision descriptors. 
The proposed techniques employ non-parametric potential fields centered on the 
Early Cognitive Vision descriptors to allow for curving hand trajectories around 
objects, and finger motions that adapt to the object's local geometry. The methods 
were tested on a real robot and found to allow for easier imitation learning of hu- 
man movements and give a considerable improvement to the robot's performance 
in grasping tasks. 

Keywords: Dynamical motor primitives, Early cognitive vision descriptors, 
Grasping. 



1 Introduction 

Consider the scenario wherein you want to have a humanoid robot grasp an object in 
a cluttered space. The first stage of most grasp planners determines a suitable grasp 
location on the object 12111131 . Having selected a final location and orientation for the 
hand, the robot must then determine how to execute the grasp so as not to collide with 
the object or any of the surrounding objects. 

The traditional solution for this scenario involves supplying the robot with a CAD 
model of the objects and a laser scanner or other means (ERFID, previous position, etc.) 
for obtaining their precise positions. These tools give the robot ample knowledge to ap- 
ply a planning algorithm that determines a suitable path to the goal. This process relies 
on precise sensor information and can be very time consuming given a complex scene 
with numerous possible object collisions to test for at each step. In contrast, humans 
can perform successful grasps of objects in the periphery of their vision, where visual 
information is limited. 

J. A. Cetto et al. (Eds.): Informatics in Control, Automation and Robotics, LNEE 89, pp. 211 4223.1 
springerlink.com © Springer- Verlag Berlin Heidelberg 2011 
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Fig. 1. The robot used in our experiments and an example of a grasping task in a cluttered 
environment. 

Taking inspiration from human movements, we propose a reactive method for robots 
grasping objects in cluttered environments using potential fields based on only a small 
amount of visual information. Specifically, we present methods for incorporating infor- 
mation derived from Early Cognitive Vision Descriptors (ECVD) [ 20 ] into the dynami- 
cal system motor primitives (DMP) [ 22 ] framework. The Early Cognitive Vision system 
(see Appendix and Figure was chosen since it makes only a few assumptions about 
the object being grasped, while the motor primitives (see Appendix) were chosen be- 
cause they generalize well to new situations and can be learned through imitation lfT2l . 
The two frameworks are also compatible with each other and thus straightforward to 
combine. 

The ECVDs were used to elegantly augment the DMPs for grasping tasks, resulting 
in the robot being able to avoid obstacles, curve its reaching trajectories around the 
object to grasp, and adapting the fingers to the local geometry of the object. 

2 Methods for Reactive Grasping 

The methods proposed in this section were inspired by human movements. Human 
grasping movements can be modeled as two linked components, transportation and fin- 
ger posture, synchronized by a shared timer or canonical system [5 18]. Transportation 
refers to the actions of the arm in moving the hand, while the finger posture aspect 
relates to the preshaping and closing of the fingers 1 14] . 

Humans perform the reaching/transportation component in a task-specific combina- 
tion of retina and hand coordinates |8], which allows for easier specification of object 
trajectories in a manipulation task than joint coordinates would and results in a reduc- 
tion in dimensionality. These movements also have curved trajectories that are needed 
for avoiding obstacles and reaching around objects, which mainly occurs in a planar 
subspace (24). 

Similar to the transportation component, the main purpose of the finger posture com- 
ponent is to preshape the hand by extending the fingers sufficiently for them to pass 
around the object upon approach, and then close on the object simultaneously for a 
good grasp. Over-extending the fingers is undesirable as it makes collisions with the 
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Fig. 2. The green ECVD represent the object to be grasped, while the surrounding ECVDs in the 
scene are clutter. The coordinate frame of the third finger of the Barrett hand (the lower finger 
in the image) and variables used in section [2] are shown. The x-y-z coordinate system is located 
at the base of the finger, with z orthogonal to the palm, and y in the direction of the finger. The 



marked ECVD on the left signifies the j descriptor, with its position at Vj = (vj x , 



o J 



and edge direction ej : 
p = (p.x,p.y,p.z) T . 



(ej X , ejy , ej Z ) T of unit length. The position of the finger tip is given by 



environment more likely and is therefore usually restricted to situations that present 
large uncertainties about the object 1 17 5 ]. 

Curved reaching trajectories and preshaping of the hand were incorporated into the 
robot via a potential field, as described in Sections |2J1 and [2721 Subsequently, a higher 
level controller is proposed in Section [231 which allows the grasping movements to be 
interpolated better to new target grasp locations. 



2.1 DMP Based Attractor Field 

The first step towards specifying the grasping movements is to define an attractor field as 
a DMP that encodes the desired movements given no obstacles. The principal features 
that need to be defined for these DMPs are 1) the goal positions, and 2) the generic 
shape of the trajectories to reach the goal. 

Determining the goal posture of the hand using the ECV descriptors has been pre- 
viously investigated in. (6). In this work, possible grasp locations were hypothesized 
from the geometry and color features of the ECVDs, and used to create a kernel density 
estimate of suitable grasps, which is then refined by attempting grasps to test them. 

However, this grasp synthesizer only gives the desired location and orientation of the 
hand, but leaves finger placement to a secondary finger controller, e.g., 1101231 . Using 
the ECVDs, the goal position of each finger is approximated by first estimating a contact 
plane for the object in the finger coordinate system shown in Figure[2] To make it a local 
approximation, the influence of the i th ECVD is weighted by W{ = exp(— cr~ 2 vf x — 
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a y 2v iy ~ a z 2v iz)' where cr x ,a y , and a z are length scale constants, and Vi is the position 
of the ECVD in the finger reference frame. The hand orientation was chosen such that 
the Z direction of the finger is parallel to the approximated contact plane, which reduces 
the problem to describing the plane as a line in the 2D X-Y space. The X-Y gradient 
of the plane is approximated by <j) = (J2i=i w i)~ 1 J2i=i w i arctan(e^/e^), where N 
is the number of vision descriptors, and e$ is the direction of the i th edge. The desired 
Y position of the fingertip is then given by 

~ _ ^ZjLi( w i v iy - taxi((/>)wiVi X ) 

Py ~ y^at ' 

which can be easily converted to a joint parameter using the inverse kinematics of the 
hand. 

Having determined the goals of both transportation and finger-posture components, 
the next step is to define the trajectories used to reach these goals. Many of the beneficial 
traits of human movements, as described earlier, can be transferred to the robot through 
imitation learning. Learning by imitation involves a human demonstrating a motion and 
the robot then mimicking the movement. Details for imitation learning with DMPs can 
be found in (Y2\ . 

We can now combine the goals and imitation learned trajectories to specify the DMPs 
and thus the attractor fields. 

2.2 ECVD Based Detractor Fields 

Having specified the rudimentary grasping movements, a detractor field is employed 
to refine the motions in order to include obstacle avoidance for the transportation and 
ensure that the finger tips do not collide with the object during the hand's approach. 

The detractor field will be based on ECVDs, which can be envisioned as small line 
segments of an object's edges localized in 3D, as shown in Figure[2]for a scene as shown 
in Figure [T] The detractive potential fields for ECVDs are characterized by two main 
features; i.e., the detractive forces of multiple ECVDs describing a single line do not 
superimpose, and the field does not stop DMPs from reaching their ultimate goals. The 
system therefore uses a Nadaray a- Watson model |4] of the form 



.7=1 



Tn 



to generate a suitable detractor field, where Vi is a weight assigned to the i th ECVD, s 
is the strength of the overall field, x is the state of the DMPs' canonical system, and q 
is the detracting force for a single descriptor. 

The weight of an ECVD for collision avoidance is given by r{ = exp(— (v^ — 
p) T h(v^ — p)), where v$ is the position of the i th ECVD in the local coordinate system, 
h is a vector of positive length scale hyperparameters, and p is the finger tip position, 
as shown in Figure [2] The detractor therefore puts more importance on ECVDs in the 
vicinity of the finger. 
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The strength factor ensures that the detractor forces always tend to zero at the end 
of a movement and thus it can not obstruct the attractor from achieving its goal at the 
end. Therefore, the strength of the detractors is coupled to the canonical system of the 
DMP; i.e., s(x) = Q^ 7 - =1 ^j)~ l J2i=i ^i w i x > where x is the value of the canonical 
system, ip are its basis functions, and w specify the varying strength of the field during 
the trajectory. 

The transportation and finger-posture movements react differently to edges and thus 
employ different types of basis functions C{ for their respective potential fields. For the 
fingers, the individual potential fields are logistic sigmoid functions about the edge of 
eachECVDof the form p(l + exp(^<j~ 2 ) ) -1 , where di — (p — v$) - e^(p - v^) T eJ 
is the distance from the finger to the edge, p > is a scaling parameter, and a c > is 
a length parameter. Differentiating the potential field results in a force term of 

exp (di(j~ 2 ) 



(l + exp(^a c - 2 )) 2 ' 

As the logistic sigmoid is monotonically increasing, the detractor always forces the 
fingers open further to move their tips around the ECVDs and thus ensure that they 
always approach the object from the outside. 

The hand uses instead the Gaussian basis functions of the form ^exp(-0.5d^dicr~ 2 ), 
where d^ = (q — v^) — e^(q — Vi) T e^ is the distance from the end effector position, 
q, to the edge, and g > and ad > are scale and length parameters respectively. 
Differentiating the potential with respect to d^ gives a force term in the Y direction of 



Ci = 



gdicr d 2 exp(-0.5d^ r d i a d 2 ) 



which can be interpreted as a radial force from the edge with an exponentially decaying 
magnitude. 

The detractor fields, of both the grasping and reaching components, have now been 
defined, and can be superimposed into the DMP framework as 

y = (a z (p z r~ 2 (g - y) - r^y) + ar~ 2 f(x)) - r~ 2 u, 

which then represents the entire ECVD and MP based potential field. 

2.3 High Level DMP Controller for Grasping 

Having defined the potential field for a single grasping motion, we interpolate the move- 
ments to new target grasps. Having a motion representation that can be interpolated to 
new targets is crucial for imitation learning. Given such a representation, the number of 
example trajectories required from the demonstrator can be greatly increased, making 
learning easier. While DMPs can interpolate to arbitrary goal positions, they have two 
drawbacks for grasping tasks; i.e., 1) the approach direction to the grasp can not be 
arbitrarily defined, and 2) the amplitude of the trajectory is unneccessarily sensitive to 
changes in the start position yo and the goal positon g if yo w g during training, which 
can cause the robot to reach the limits of its workspace. 
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Fig. 3. The above diagram shows the the change in coordinate systems for the transportation 
DMPs. The axes X w -Y w -Z w are the world coordinate system, while X p -Y p -Z p is the planar 
right handed coordinate system in which the DMP is specified. The trajectory of the DMP is 
shown by the pink line, starting at the green point, and ending at the red point. Note that X p 
is parallel to the approach direction of the hand, which is shown by the black arrow a. The 
planar axis Y p is perpendicular to X p , and pointing from the motor primitive's starting location s 
towards the goal g. 

These difficulties can be overcome by including a supervisory controller that mod- 
ifies the hyperparameters of the DMPs appropriately. The supervisor can maintain the 
correct approach direction by using a task-specific coordinate system. Due to the trans- 
lation invariance of DMPs, only a rotation, R G SO (3), between the two coordinate 
systems needs to be determined. The majority of the motions will lie in a plane defined 
by the start and goal locations, and the final approach direction. 

The first new in-plane axis x p is set to be along the approach direction of the grasp; 
i.e., x p = —a as shown in FigureO As a result, the approach direction is easily defined 
and only requires that the Y p and Z p primitives reach their goal before the X p primitive. 
The second axis, y , must be orthogonal to x p and also in the plane, as shown in Fig- 
ureO It is set to y p = 6 _1 ((g — s) — x p (g — s) T x p ), where 6 _1 is a normalization term, 
and s and g are the motion's 3D start and goal positions respectively. The third vector, 
z p , is orthogonal to the plane, and is derived by completing the right-handed coordinate 
system, i.e., z p = x p x y The DMPs can now be specified by the supervisor in the 
Xp-Yp-Z p coordinate system, and mapped to the X w -Y w -Z w world reference frame by 
multiplying by R T = [x p , y p , z p ] T . 

The second problem relates to the scaling of motions with ranges greater than y^ — g, 
which both components require to move around the outside of objects. In the stan- 
dard form a = g — yo (13), which can lead to motions that easily exceed the robot's 
workspace if g « yo during the training, but not during the motion reproduction. The 
supervisor can control these trajectories by scaling the shaping force (see Appendix), 
and thus we propose the amplitude term 



a = \\v(g ~ Vo) + (1 - v)(9t ~ 2/ot)|| , 
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Effects of DMP Amplitude a Factor 
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Fig. 4. This is a demonstration of the effects of augmenting the amplitude variable a of DMPs. 
The black lines represent boundaries. The green plot shows the trained trajectory of the DMP 
going to 0.05, and is the same for all amplitude values. Now consider the scenario wherein the 
goal is placed at 0.1, but the workspace is limited to ±0.75 (top boundary). The dashed red 
line is the standard generalization to a larger goal, while the red plot uses the new amplitude. 
Notice how the new amplitude restricts the range of the trajectory to the workspace. In a different 
scenario, we move the goal to —0.05, but require the goal to be reached from above (lower right 
boundary), e.g., a finger placed on a surface. The dashed blue line is the standard generalization 
to a negative goal, and the blue trajectory uses the new amplitude. Note that the trajectory is 
not flip in the case of the new amplitude and thus stays within the restricted region. Both of the 
new trajectories were generated with 77 = 0.25, and maintain shapes close to that of the training 
trajectory. 

where gr and yor are the goal and start positions of the training data respectively, 
and r] G [0, 1] is a weighting hyperparameter. The resulting trajectory amplitude is in 
the convex hull of the training amplitude and the standard interpolation value (a = 
9 — Vo) [13 1 and thus only affects how conservative the generalization to new points is, 
as can be seen in Figure[4j By taking the absolute value of the amplitude, the approach 
direction is not reversed, giving a result similar to the use of a constant amplitude pro- 
posed by Park et al. |[T9l , which corresponds to the special case of r\ = 0. Example 
interpolations of a transportation trajectory can be seen in Figure [3 



3 Grasping Experiments 

The methods described in Section [2] were implemented and evaluated on a real robot 
platform. The robot consists of a Videre stereo camera mounted on a pan-tilt unit, a 
Barrett hand, and a Mitsubishi PA 10 arm. The robot was given the task of grasping an 
object amongst clutter using only an ECVD model of the object. The results of these 
trials were then compared to trials of the same grasps using other standard robotics 
methods for comparison. We hypothesize that our method will result in significantly 
more successful grasps than the other methods. 
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Interpolation of Reaching Around an Object 




X Position [m] 



Fig. 5. The plot shows workspace trajectories, wherein the x and y values are governed by two 
DMPs sharing a canonical system. The red lines indicate the desired approach direction while the 
green semicircle indicates the goal positions along them. The blue lines show the trajectories for 
the different goals. They make use of the higher level controller of Subsection |2.3l with 77 = 0.25. 
The approach direction DMP was trained on an amplitude of one. 



3.1 Grasping Experiment Procedure 



Before the robot can perform a grasping task, its motions must be initialized. Determin- 
ing the finger goal state and specifying the detractor fields introduces several new hy- 
perparameters that have simple geometrical interpretations. For instance, h = 2[w I l] T , 
where w and / are the width and length of the finger respectively. To reflect the human 
tendency towards more precise movements during the last 30% of a motion |'15), the 
strength function, s(x), was set to give the highest strengths during the first 70% of the 
motion for the transportation, and the last 30% for the finger posture. 

A VICON™ motion tracking system was used to record the movements of a human 
test subject during a grasping task, which used a different object to the one used by 
the robot. As the reaching trajectories are encoded in task space rather than joint space, 
the correspondence problem was not an issue for the imitation learning. Similarly, the 
DMPs of the fingers are homogeneous, which circumvents the correspondence problem 
of mapping the five human fingers onto the three fingers of the robot. The imitation 
learning was performed using locally weighted regression in the the X p -Y p -Z p coor- 
dinate system, as proposed by Ijspeert et al. fl2l . 

Having defined the basic motions, the robot was then given the task of grasping an 
object without hitting surrounding obstacles (see Figure [TJ. Each trial begins with an 
estimate of the pose of the object relative to the robot [7 ] and sets its grasp location ac- 
cordingly. The model's ECVD are then projected into the scene, and the robot attempts 
to perform the grasp and lift the object 15cm so that it is clear of the stand. The trial 
is a success if the robot can detect the object in its hand at this point. If the hand col- 
lides with an obstacle or knocks the object down, the trial is marked as a failure. Grasps 
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Fig. 6. The occurrences of successes and collision types for the different methods are shown. 
The first column presents the results for the traditional robotics method of specifying trajectories 
by via points. The second column corresponds to using standard DMPs, while the final column 
incorporates the ECVD based potential field and supervisory DMP controller. The occurrences 
are given as the percentage of trials. Trials that collided multiple times, are classified by their first 
collision. 



were varied to include different approach directions and locations around the object. 
The experiment consisted of 45 trials. 

Two alternative approaches were compared with our proposed method. The first rep- 
resents a standard robotics approach of specifying a trajectory by straight lines between 
via points and uses fully extended fingers with no preshaping of the hand. The other ap- 
proach is to use standard DMPs learned from the same human demonstrated movements 
as our proposed methods, but without the proposed detractor field and supervisory con- 
troller. The same grasp locations were proposed to the different methods, and obstacles 
were placed in similar positions for the the different trials to allow for a fair comparison 
between the methods. 



3.2 Experimental Results 



From the three tested methods, the proposed method acquired the highest success rate, 
as can be seen in Figured The task was not trivial, and all of the methods encountered 
both successes and problems during the trials. 

The standard DMP method encountered the most problems (success rate of only 7%) 
a majority of which were caused by collisions with the object. This high failure rate can 
be attributed to the method not specifically incorporating a desired approach direction. 
In successful trials, the approach direction was close to that of the initial imitation 
learning. Therefore the proposed DMP supervisor improved the generalization of the 
movement to new target grasps, and the system would benefit from it even in uncluttered 
environments. Similarly, the open-loop preshaping of the hand helped avoid obstacles, 
but occasionally prevented the hand from being sufficiently open to accept the object. 
The proposed detractor field successfully overcame this problem for the ECVD DMPs. 
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The via points method encountered no collisions with the object, and would have 
worked well in an uncluttered environment. The method still encountered collisions 
with the obstacles for 73% of the trials, but this is more reflective of the difficulty of the 
task rather than the via point method. The method can therefore be considered as a good 
approach if it were combined with a suitable path planning method for obstacle avoid- 
ance. However, the path planner would need additional information and assumptions 
about the scene and possibly even extra hardware to acquire it. 

The proposed method had a success rate of 93%, with no occurrences of collisions 
with obstacles. The trials that did fail were the result of the object falling down while 
the fingers were closing and thus do not indicate problems with the approach used 
to reaching the grasp location. The method does have certain restrictions though. The 
magnitude of the detractor fields needs to be calibrated based on the density of ECVDs 
for common objects, but some obstacles encountered may present lower densities. As 
the current set of ECVD relies on object edges, smooth objects can lead to noisy or 
very sparse descriptors, and therefore not create a suitable basis for obstacle avoidance. 
As the number of descriptor types increases (e.g., corner and plane descriptors), this 
will become less of a problem. Occluded obstacles will also need to rely on additional 
information (e.g., force feedback) to be avoided, although this is a source of error for 
all vision based planners. 

Given a few restrictions, the results still show that our hypothesis was correct and 
the proposed methods represent a suitable basis for avoiding obstacles without relying 
on a complicated path planner and using only a small amount of vision information 
compared to standard robot systems. 

4 Conclusions 

The proposed methods augment dynamical system motor primitives to incorporate Early 
Cognitive Vision descriptors by using a potential field. These methods represent impor- 
tant tools that a robot needs to reactively execute grasps of an object in a cluttered 
environment without relying on a complex planner. The techniques allow for preshap- 
ing the fingers to match the shape and size of the object and curving the trajectory of 
the hand around objects! 24]. These modifications were tested on a real robot, and it was 
discovered that the methods were not only successful at performing the task, but also 
allowed for easier imitation learning, better interpolation of the learned trajectories, and 
significantly better chances of a success of a grasp in cluttered environments than stan- 
dard motor primitives. Although the experiments were performed within a grasping task 
scenario, the proposed methods can be beneficial for other manipulation tasks, such as 
pressing buttons and pushing objects. 

References 

1. Arimoto, S.: Control Theory of Multi-fingered Hands. Springer, London (2008) 

2. Bard, C, Troccaz, J., Vercelli, G.: Shape analysis and hand preshaping for grasping. In: 
Proceedings of IROS 1991(1991) 

3. Bicchi, A., Kumar, V.: Robotic grasping and contact: a Review. In: Proceedings of 
ICRA 2000 (2000) 



Grasping with Vision Descriptors and Motor Primitives 221 

4. Bishop, CM.: Pattern Recognition and Machine Learning. Springer, Heidelberg (2006) 

5. Chieffi, S., Gentilucci, M.: Coordination between the transport and the grasp components 
during prehension movements (1993) 

6. Detry, R., Kroemer, O., Popovic, M., Touati, Y., Baseski, E., Krueger, N., Peters, J., Piater, 
J.: Object- specific grasp affordance densities. In: ICDL (2009) 

7. Detry, R., Pugeault, N., Piater, J.: Probabilistic pose recovery using learned hierarchical ob- 
ject models. In: International Cognitive Vision Workshop (2008) 

8. Graziano, M.S.: Progress in understanding spatial coordinate systems in the primate brain. 
In: Neuron (2006) 

9. Hartley, R., Zisserman, A.: Multiple View Geometry in Computer Vision. Cambridge 
University Press, Cambridge (2000) 

10. Hsiao, K., Nangeroni, P., Huber, M., Saxena, A., Ng, A.: Reactive grasping using optical 
proximity sensors. In: Proceedings of ICRA 2009 (2009) 

11. Iberall, T.: Grasp planning for human prehension. In: Proceedings of ICAI 1987 (1987) 

12. Ijspeert, A. J., Nakanishi, J., Schaal, S.: Movement imitation with nonlinear dynamical sys- 
tems in humanoid robots. In: ICRA (2002) 

13. Ijspeert, A.J., Nakanishi, J., Schaal, S.: Learning attractor landscapes for learning motor 
primitives. In: NIPS (2003) 

14. Jeannerod, M.: Grasping Objects: The Hand as a Pattern Recognition Device. Perspectives 
of Motor Behaviour and Its Neural Basis (1997) 

15. Jeannerod, M.: The study of hand movements during grasping. A historical perspective. In: 
Sensorimotor Control of Grasping: Physiology and Pathophysiology, Cambridge University 
Press, Cambridge (2009) 

16. Krueger, N., Lappe, M., Woergoetter, R: Biologically motivated multimodal processing of 
visual primitives. The Interdisciplinary Journal of Artificial Intelligence and the Simulation 
of Behaviour (2004) 

17. Oztop, E., Bradley, N.S., Arbib, M.A.: Infant grasp learning: a computational model (2004) 

18. Oztop, E., Kawato, M.: Models for the control of grasping. In: Sensorimotor Control of 
Grasping: Physiology and Pathophysiology, Cambridge University Press, Cambridge (2009) 

19. Park, D.-H., Hoffmann, H., Pastor, P., Schaal, S.: Movement reproduction and obstacle avoid- 
ance with dynamic movement primitives and potential fields. In: IEEE International Confer- 
ence on Humanoid Robots(HUMANOIDS) (2008) 

20. Pugeault, N.: Early Cognitive Vision: Feedback Mechanisms for the Disambiguation of Early 
Visual Representation. Vdm Verlag Dr. Mueller (2008) 

21. Saxena, A., Dreimeyer, J., Kearns, J., Osondu, C, Ng, A.: Learning to Grasp Novel Objects 
using Vision. In: Experimental Robotics. Springer, Berlin (2008) 

22. Schaal, S., Peters, J., Nakanishi, J., Ijspeert, A.: Learning movement primitives. In: Proceed- 
ings of ISRR 2003 (2003) 

23. Steffen, J., Haschke, R., Ritter, H.: Experience-based and tactile-driven dynamic grasp con- 
trol. In: Proceedings of IRS 2007 (2007) 

24. Wank, V, Fischer, A., Bos, K., Boesnach, I., Moldenhauer, J., Beth, T.: Similarities and 
varieties in human motiontrajectories of predefined grasping and disposing movements. 
International Journal of Humanoid Robotics (2004) 



222 O. Kroemer et al. 

APPENDIX 

Dynamical Systems Motor Primitives 

The dynamical systems motor primitives (DMPs) proposed by Ijspeert et al. 1 13] were 
inspired by the simple, but highly adaptive, motions that animals employ, and combine 
to obtain more complex motions. The primitives are implemented as a passive dynami- 
cal system with an external force, and represented as 

V = a z (p z r~ 2 (g - y) - r^y) + ar~ 2 f(x), (1) 

where a z and /3 Z are constants, r controls the duration of the primitive, a is an ampli- 
tude, f(x) is a nonlinear function, and g is the goal for the state variable y. 

By selecting a z and /3 Z appropriately, and setting a = 0, the system reduces to 
y = a z ((3 z r 2 (g — y) — ry) and becomes a critically damped global attractor. It can 
be visualized as a spring and damper system that ensures state y will always end at the 
goal value g. 

The function f{x) is a shaping function based on the state, x G [0,1], of the canonical 
system that synchronizes the DMPs x = — a x rx, where a x is a time constant. The 
function takes the form 

Ei=i ^ 0) 

where M is the number of basis functions, ip(x) are Gaussian basis functions, and w 
are weights acquired through locally weighted regression |[T3l . This function has the 
effect of introducing a non-linearity that can affect the spring-damper system to output 
any arbitrary trajectory specified by the user. Due to the dependence of f(x) on x, the 
shaping term decays to zero with x, so that the spring and damper beneficial properties 
of the attractor are maintained. 

The resulting primitives can encode arbitrary trajectories, and still ensure that the 
goal state is always achieved. The trajectories can also be scaled in time and space by 
setting the r and g variables appropriately and thus generalize to a range of situations. 

Early Cognitive Vision System 

The entire prehensile process effectively occurs before the hand has even touched the 
object and thus the vision system plays a very important role 121111 . Our system uses 
the Early Cognitive Vision methods of Pugeault et al. [20.9], which makes a minimal 
number of assumptions about the object, and has been successfully implemented to 
determine good grasp locations |6|. A principal idea of this vision system is to store 
additional low level information and perform perceptual grouping on it to later aid the 
higher level stereo matching and 3D constructions. 

The methods extract local features of a scene, which it localizes and orientates in 
space |[T6l . Each descriptor is a symbolic representation for an edge in 3D. The result- 
ing features are called early cognitive vision descriptors (ECVD) [20], and can be used 
in generating models of objects for pose estimation [7], and for symbolically describ- 
ing 3D scenes. By using a large amount of small ECVDs, any arbitrary object can be 
represented. 
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When performing a grasping task, the robot uses a hierarchical Markov model of 
the object's ECVD geometry [7 ] to determine its pose, which can then be used to su- 
perimpose the ECVDs of the model back into the scene. The grasping techniques can 
therefore use geometric information of a partially occluded object. 
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Abstract. In this chapter, the tracking control problem for a unicycle-type mo- 
bile robot with network-induced delays is addressed. The time-delay affects the 
system due to the fact that the controller and the robot are linked via a delay- 
inducing communication channel, by which the performance and stability of the 
system are possibly compromised. In order to tackle the problem, a state estima- 
tor with a predictor- like structure is proposed. Acting in conjunction with a track- 
ing control law, the resulting control strategy is capable of stabilizing the system 
and compensates for the negative effects of the time-delay. The local uniform 
asymptotic stability of the closed-loop system is guaranteed up to a maximum 
admissible time-delay, for which explicit expressions are provided in terms of the 
system's control parameters. The applicability of the proposed estimator-control 
strategy is demonstrated by means of experiments carried out between multi- 
robot platforms located in Eindhoven, The Netherlands and Tokyo, Japan. 

Keywords: Mobile robot, Remote tracking control, Network delay, Nonlinear 
estimator, Non-holonomic systems. 



1 Introduction 

In the increasingly fast and diverse technological developments of the last decades the 
duties and tasks conferred to control systems have become much more complex and de- 
cisive. Requirements now encompass flexibility, robustness, ubiquity and transparency, 
among others. 

Specifically, the study of robotic systems controlled over a network has become sig- 
nificatively important as a way to support the design of robotic systems that can perform 
remote, dangerous or distributed tasks. The remote control, or the control of a system 
subject to a network-induced delay is important in e.g. teleoperation strategies and is a 
central topic in Networked Control Systems (NCS). 

Several techniques have been proposed so far in order to cope with network-induced 
delays in these settings; e.g. the use of scattering transformations, wave variables for- 
mulation, queuing methodologies, delay compensation techniques and robust control 
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design to name a few. A detailed description of such techniques and many others, to- 
gether with further references, can be found in e.g. (H, Q, (T9). 

In this work, a control strategy for the remote tracking control of a unicycle-type 
mobile robot is proposed. The network-induced delay is compensated by means of a 
state estimator inspired by the predictor based on synchronization presented in lfT2l . 
lfT3l . The main idea behind the state estimator is to reproduce the system's behavior 
without delay in order to drive an anticipating controller. The problem presents vari- 
ous challenges since the system is nonlinear and subject to a non-holonomic constraint. 
Additionally, the difficulties faced when implementing the proposed ideas in an exper- 
imental setting using the Internet as the communication channel should be taken into 
account and are also discussed in depth. In [8], a similar state estimator has been ap- 
plied to a mobile robot subject to a communication delay, and sufficient conditions for 
the estimator's convergence have been derived. In this work an alternative approach is 
taken in order to prove the stability of the entire closed-loop system consisting of the 
mobile robot, the tracking controller and the state estimator. 

This chapter is structured in the following way. Section[2]recalls results on the track- 
ing control of a delay-free unicycle-type mobile robot. In Section [3] a control scheme 
intended to control a mobile robot with a network-induced time-delay is proposed and 
conditions on the maximum allowable time-delay in terms of the control parameters are 
posed. Section|4]provides an overview of the experimental platform used to validate the 
control strategy proposed, explains how the most critical implementation issues have 
been addressed, and presents the experimental results. Finally, conclusions are provided 
in Section [5] 

2 Tracking Control of a Unicycle Robot 

The tracking control design for a unicycle-type mobile robot is discussed in this section. 
To begin with, consider the posture kinematic model of a unicycle: 

x(t) = v(t) cos 0(t), 

y(t) =v(t) sin 0(t), (1) 

in which x(t) and y(t) denote the robot's position in the global coordinate frame X-Y 
(cf. FigureQ]), 0(t) defines its orientation with respect to the X-axis, and v(t) and uj(t) 
constitute the robot's translational and rotational velocities, respectively, and are re- 
garded as the system's control inputs. The robot's state is defined by q(t) = 
[x(t) y(t) 0(t)] T and the non-slip condition on the unicycle's wheels imposes a non- 
holonomic constraint on the system, as explained in (TJ. 

The control objective is to track a time- varying reference trajectory specified by 
q r (t) = [x r (t) y r (t) r (t)] T . The reference position (x r (t),y r (t)) satisfies the dy- 
namics, 

x r (t) = v r (t) cos r (t) , 

y r (t) = v r (t) sin 6 r (t), 
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Fig. 1. Mobile robot, reference system, and error coordinates. 



while the reference orientation r (t), translational velocity v r (t), and rotational veloc- 
ity uj r (t) are defined in terms of the reference Cartesian velocities x r (t), y r (t) and 
accelerations x r (t), y r (t) as follows: 



r (t) = arctan2 (y r (i), x r (t)) , 



v r (t) = y/x*(t)+y?(t), 

x r (t)y r (t) -x r (t)y r (t) 



uo r (t) 



xm + y 2 r (t) 



-(*), 



(3) 
(4) 

(5) 



where atan2 is the arctangent function of two arguments. It is worth noting that com- 
puting (O and (01 requires either x r (t) ^ or y r (t) ^ at all times. 

The difference between the reference trajectory and the state evolution may be ex- 
pressed with respect to the system's local coordinate frame X'-Y' in order to define the 
error coordinates q e (t) = [x e (t) y e (t) e (t)] T , as proposed by |7| and shown in Figure 
Q] These tracking error coordinates are given by, 



(6) 



x e (i) 




cos 0(t) sin0(t) 




x r (t) — x(t) 


Ve(t) 


= 


-s'm6(t) cos 0(t) 




y r (t) -y(t) 


e (t) 




1 




6 r (t)-6(t) 



Exploiting (Q]), (0, ([5]), and ©, the tacking error dynamics result in, 

x e (t) =u(t)y e (t) +v r (t)cos0 e (t) - v(i), 
y e (t) = -u(t)x e (t) +v r (£)sin0e(£), 
e (t) =uo r (t) -uo(t). 

The following tracking controller has been proposed in (6), (HJ, 

v(t) = v r (t) H- c 2 x e (t) - c 3 (j r (t)y e (t), 
uj{t) = uj r {t) + c\ sin0 e (£), 



(7) 



(8) 
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Fig. 2. Block diagram representation of proposed remote tracking control strategy. 

which ensures local exponential stability (LES) of the tracking error dynamics d)-® 
if ci, C2 > and cs > — 1. 



3 Remote Tracking Control 

In this section, we consider a mobile robot controlled over a network which induces 
time-delays, see Figure[2l The robot's controller, consisting of the tracking control law 
® and a state estimator, should ensure that the robot tracks (a delayed version) of the 
reference trajectory. The state estimator has a predictor-like structure and is similar to 
the one proposed in [8 ]. The origin of this type of predictor can be traced back to the 
appearance of the notion of anticipating synchronization in coupled chaotic systems, 
which was first noted by [20] . After the same behavior was observed in certain simple 
physical systems such as specific electronic circuits and lasers, it was studied for more 
general systems in lfT4l . As a result of this generalization, a state predictor based on 
synchronization for nonlinear systems with input time-delay was proposed in lfT2l . The 
same concept, which can be seen as a state estimator with a predictor-like structure, is 
proposed here for a mobile robot subject to a network-induced delay. 



3.1 State Estimator and Controller Design 

When considering a network-induced delay, the mobile robot is subject not only to a 
forward r/ (input) time-delay, but also to a backward r^ (output) time-delay, as denoted 
in Q. Hereinafter the forward and backward time-delays Tf, 75 will be assumed to 
be constant and known, with r := t^ + Tf. Given the mobile robot (QJ subject to a 
network-induced input delay Tf, the robot's posture kinematic model is given by, 



x(t) = v(t - 
y(t) = v(t - 
0(t) = u(t 



Tf) cos#(£), 
Tf) sin#(£), 

- r f)- 



(9) 
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Moreover, the system's state measurements are affected by a backward time-delay 75: 

q(t - n ) = [x(t - n) y(t - n ) 6{t - n )] T '. 

In order to improve the tracking performance when subject to a communication de- 
lay, the following state estimator, with state z(t) = [z\{t) z 2 (t) zs(t)] T , is proposed: 

Zi(t) = v{t) cos z s {t) +v x {t), 

z 2 (t)=v(t) sin z 3 (t) + v y (t), (10) 

z s {t)=Lj{t) + u e {t), 

with v(t) = [v x (t) v y (t) vo(t)] T defining a correcting term based on the difference 
between the estimator state and the measured state. 

For the purpose of designing the correcting term v(t), two new sets of error coordi- 
nates are introduced, namely z e (t) and p e (t). The first set of error coordinates relates 
to the difference between the estimator state z(t) and the reference trajectory q r (t): 



(11) 



The second set of error coordinates relates to the difference between the delayed esti- 
mator state z(t — f) and the delayed system state q(t — t&): 
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(12) 



where f := f/ + f& represents the sum of the modeled forward and backward network- 
induced delays. Recall that the time-delays are assumed to be known or, in other words, 
modeled perfectly, i.e. f/ = r/ and f& = r&, which yields f = r. 

Given the error coordinates (fT2k the correcting term v(t) is proposed as follows: 



(13) 



where K x , K y and Kq are the correcting term gains. 

The block diagram representation of the proposed control scheme is depicted in Fig- 
ure [51 and shows that the state estimator's output constitutes the controller's input. The 
tracking control law (0) will now make use of the estimated error coordinates (ITTb and 
will be given by, 



Vx(t) = 


-K x p le (t) cos z 3 (t) + K y p2 e (t) sin z 3 (t) , 


Vy(t) = 


-K x pi e (i) sin z 3 (i) - K y p 2e (t) cos z 3 (t) , 


Mt) = 


-K e smpsAt): 



v(t) = v r (t) + c 2 z le (t) - cscu r (t)z 2e (t), 
uj(t) = u r (t) +cisinz 3e (t). 



(14) 



Remark 1. Due to the input time-delay r/, the control action applied to the robot in © 
is given by: 

V(t -Tf) = V r (t -Tf) + C 2 Z le (t - Tf) - C 3 UJ r (t -Tf)z 2e (t -Tf), 
Cj(t -Tf) = CJ r (t - Tf) +CiShl2:3 e (£ -Tf). 
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The resulting control action already hints at how we would like the system to behave. 
Intuitively, the robot state q(t) should track the delayed reference trajectory q r (t — Tf). 
This will be examined in detail during the stability analysis in Section [3^21 



3.2 Stability Analysis 

The control objectives may now be defined as follows: 

- q(t) — > q r (t — Tf), the system states converge to the reference trajectory delayed 
byr/; 

- z(t) — > q(t + r/), the state estimator anticipates the system by t/-; 

- z(t) — > q r (t), the state estimator converges to the reference trajectory. 

Considering these control objectives and taking into account Remark [TJ the following 
control goal can now be formulated: 

Given the unicycle-type mobile robot $9§ subject to a network induced delay r = 
Tf + Tb, the state estimator (TQ, (T/2l)-([7Jl), and the control law ( 1771 ), ( T/41) , the robot 
should track a delayed version q r (t — Tf) of the reference trajectory. 

In order to meet this control goal we aim to prove the stability of the equilibrium 
point (z e ,p e ) = (zi e , Z2 e , Z3 e ,pi e ,p2 e ,P3 e ) = of the closed-loop system (l9l)-([T4l). 

Consider the following error coordinate definitions:^! = [z± e Z2 e p\ e P2 e ] T and £2 



[z3 e P3 e ] T , with z ie ,p ie 



1,2,3, defined in (TTTb and (fT2l) , respectively. Using these 



definitions, the resulting closed-loop error dynamics can be rearranged in the following 
form: 

i 1 (t) = A 1 (t,t-T)Zi(t)+A 2 £i(t-T)+g(t^i t ^2 t ), (15) 

6(*) = / 2 (*,6 t ), (16) 



where ^ 



1, 2, is an element of the Banach space C(n) = C([—r, 0], R n ) and is 



defined by the formula & t (s) = £i t (t + s) for 8 G [— r, 0] . By means of & f it is possible 
to represent a state ^ of the system throughout the interval t G [t — r, t] . 

The matrices and functions defining the right-hand side in dTSb-dTSl) are given by 
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(17) 
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with 

<7n = c\z 2& {t) I cos(sz 3e (t))ds — v r (t) / sm(szs e (i))ds, 
Jo Jo 

912 = -K e z 2e (t) / cos(sp 3e (t))ds, 
Jo 

921 = (v r (t) - cizi e (t)) / cos(sz 3e (t))ds, 

Jo 

922 = K e zi e (t) / cos(sp 3e (t))ds, 

Jo 

932 = -iy r {t-r) + c 2 z le (t-r) - c 3 u r (t - r)z 2e (t - r)) / sm(sps e (i))ds, 

Jo 

#42 = (v r (t - r) + c 2 z le (t - r) - c 3 a; r (£ - r)z 2e (t - r)) / cos(sp 3e (£))ds, 

Jo 

h 3 i = cip 2e {t) / cos(sz 3e (t - r))ds, 
Jo 

h 32 = -K e p 2e (t) / cos(sp 3e (t - r))ds, 
Jo 



JO 



/141 = —cipi e (t) / cos(8Z3 e (t — r))ds, 
'o 

^42 = K e pi e (t) I cos(sp 3e (t - r))ds. 
Jo 

The definition of a persistently exciting (PE) signal will be required in order to formu- 
late a stability result for the system dT5t-dT7t. 

Definition 1. A continuous function uo : R + — ► R w sa/d to Z?£ persistently exciting 
(PE) ifuo(i) is bounded, Lipschitz, and constants S c > a^J e > eiw? swc/z that, 

\/t > 0, 3s : t — 5 C < s <t such that \w(s)\ > e. 

The following theorem formulates sufficient conditions under which (z e ,p e ) = is a 
locally uniformly asymptotically stable equilibrium point of dT5T)-dT7t. 

Theorem 1. Consider the posture kinematic model of a unicycle-type mobile robot sub- 
ject to a constant and known input time-delay Tf, as given by $9^. The robot's reference 
position is given by (x r (t),y r (t)), whereas its reference orientation r (t) is given by 
0. Additionally, consider the tracking controller as given in f [7?l ), with the feedforward 
terms v r (t) and uo r (i) defined in (0) and @, respectively, and the feedback part based 
on the error between the reference trajectory and an estimate of the state, as given in 
( 1771) . Moreover, consider the state estimator f [7fll ), (T/21) -rfTJl), which uses state measure- 
ments delayed by a constant and known output time-delay 75. If the following conditions 
are satisfied: 
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- uj r {t) is bounded and persistently exciting; 

- the tracking gains satisfy ci, c<i > 0, c% > — 1; 

- the correcting term gains satisfy K x = K y = K < 0, Kq < 0; 

- the time-delay r = t^ + tj belongs to the interval < r < r max , with 



{^1 z!_\ 



min <^ — — , -— _ r \ , (18) 



where p > 1 anduo r = sup teR |o; r (t)|, 

£/z£/i, (z e ,p e ) = w a locally uniformly asymptotically stable equilibrium point of the 
closed-loop error dynamics ([731) -([771). /ft other words, z(t) — > g(t + r/) as t — > oo (7ft£ 
state estimator anticipates the state by tj) and q(t) — > q r (t — Tf) ast — > oo (7ft£ system 
tracks the reference trajectory delayed by Tf). 

Proof For brevity only a sketch of the proof is presented. Recall the closed-loop er- 
ror dynamics dTSb-dTTt and note that systems (TT5t-dT6l) form a cascade consisting of a 
nonlinear delayed system £2 (t) = $2 (t, £2 t )> interconnected to a linear time- varying de- 
layed system £1 (t) = Ai(t,t — r)£i (t) + ^£1 (t — r) by means of a nonlinear delayed 
coupling #(£,£i t ,£ 2t ). 

Based on Theorem 2 in [18], local uniform asymptotic stability of the equilibrium 
point (z e ,p e ) = of the predictor's closed-loop error dynamics may be established if 
the following conditions are satisfied, 

- the coupling term g(t, £i t , £2* ) vanishes when £2* ~~ ^ 0> i- e - 9(f, £i t , 0) = 0; 

- the unperturbed subsystem £i(t) = Al(£, t — r)£i(t) + ^2£i(t — r) is uniformly 
asymptotically stable; 

- subsystem £2^) = /^(^ £2 t ) is locally uniformly asymptotically stable. 

Let us now check the validity of these three conditions. Firstly, given g(t, £i t , £2*) as 
defined in (fTTl L it immediately follows that as £2 t — > 0, the coupling term vanishes and 
thus the first condition is satisfied. 

Regarding the second condition, subsystem £1 (t) = A± (£, t — r)£i (t) + ^£1 (t — r) 
can be represented by a cascade itself. Using a similar reasoning as for the original cas- 
cade (fT5l)-([T7l), the subsystem's uniform asymptotic stability is concluded if the time- 
delay satisfies the following condition: 

r < ~ l _ (19) 

y/p(K - u; r ) 

and the requirements for C2, C3, K x and K y stated in Theorem[T]are satisfied. 

In order to check the third condition, subsystem £2 (t) is first linearized around the 
equilibrium point zs e = ps e = 0. The uniform asymptotic stability of the linearized 
subsystem is ensured for 

r < -^-, (20) 
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Fig. 3. Maximum allowable time-delay r for conditions {19} (left) and d20l (right), respectively. 
To better illustrate the relationship between the gains and the time-delay, the maximum allowable 
delay in the plot has been cut off at 10 s. 

provided c\ and Kq satisfy the conditions in Theorem Q] Note that the satisfaction of 
(fT9l ) and d20l) is guaranteed by satisfying condition ([T8t in the theorem. 

The local uniform asymptotic stability of the equilibrium point (z e ,p e ) = of the 
closed-loop error dynamics (TT5t-dT7l) is then concluded. This means that the state es- 
timator converges to the reference trajectory, since z e (t) — > as t — > oo, or in other 
words, z(t) —> q r (t) as t — > oo. It also implies that the state estimator anticipates the 
system, due to the fact that p e (t) — > as t —> oo, i.e. z(t) —> q{t + r/) as t —> oo. 
From the previous relations it directly follows that q(t) — ► q r (t — r/) as t — > oo, which 
means that the unicycle-type mobile robot, subject to a network-induced delay r, tracks 
the reference trajectory delayed by Tf. This completes the sketch of the proof. 

The relationship between the allowable time-delay r and the control parameters for con- 
ditions dT9t and d20l) is shown in Figure The left plot shows the maximum allowable 
time-delay satisfying (fT9l ) considering p — 1 and different values for the correcting term 
gain K and for the maximum reference rotational velocity uo r . Depicted in the right plot 
is the maximum allowable time-delay satisfying d20l) given p — 1 and different values 
for the correcting term gain if<9. Note that, for both conditions, there exist choices for 
the correcting term gains such that it becomes possible to accommodate arbitrarily large 
time-delays (K — > and uo r — > for (fT9l ) and if# — > for d20l)). A word of caution 
is in order, however, since the plots also show that there is a performance tradeoff aris- 
ing from the relationship between the allowable time-delay, the correcting term gains 
and the tracking behavior. Namely, decreasing the correcting term gains allows higher 
robustness for delays at the expense of slower convergence. 



4 Experimental Results 

Two equivalent multi-robot platforms have been developed at the Eindhoven Univer- 
sity of Technology (TU/e), The Netherlands, and at the Tokyo Metropolitan University 
(TMU), Japan. The proposed remote tracking controller is implemented in such a way 
that a mobile robot located at TU/e is controlled from TMU and viceversa. 
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Fig. 4. Experimental setups at TU/e (left) and TMU (right). 

4.1 Experimental Platform Description 

The experimental platforms' design objectives encompass cost, reliability and flexibil- 
ity. The corresponding hardware and software choices, together with the implementa- 
tion of the setup at TU/e are discussed in greater detail in Q, (cf. Figure [4]). The 
setup has already been used to implement cooperation, coordination, collision avoid- 
ance and servo vision algorithms, see e.g. 1 10], |9|. The platform at TMU has similar 
characteristics, only differing from the one at TU/e in size and in the vision calibration 
algorithm used. Listed below are some of the experimental platforms' components and 
characteristics: 

- Mobile Robot. The unicycle selected is the e-puck mobile robot 1 1 1 ], whose wheels 
are driven by stepper motors that receive velocity control commands over a Blue- 
Tooth connection. 

- Vision. Each robot is fitted with a fiducial marker of 7 by 7 cm, collected by an 
industrial Fire Wire camera, interpreted in the program reacTIVision [17], and cali- 
brated by means of a global transformation (TU/e) or a grid (TMU). 

- Driving Area. The driving area is of 175 x 128 cm for TU/e and 100 x 50 cm for 
TMU, and is determined based on the required accuracy, the camera lens, and the 
height at which the camera is positioned. 

- Software. The e-puck robots and reacTIVision 's data stream can be managed in 
Matlab script, C, or Python. In this work, the controller implementation and signal 
processing is carried out in Python lfT6l . 

- Bandwidth and Sampling Rate. Using vision as the localization technique dimin- 
ishes the system's bandwidth and results in a sampling rate of 25 Hz. 



4.2 Data Exchange over the Internet 

Due to its widespread availability and low cost, the Internet is chosen as the communica- 
tion channel to exchange data between TU/e and TMU. Details about the data exchange 
implementation are given below: 
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- Data Exchange. A Virtual Private Network (VPN) is established between TU/e 
and TMU in order to implement a reliable and secure data exchange. 

- Socket Configuration. Data is exchanged between TU/e and TMU as soon as it 
becomes available using non-blocking Transmission Control Protocol (TCP) sock- 
ets running the Internet Protocol (IP). The system's low bandwidth allows for the 
use of TCP, guaranteeing reliable and orderly data delivery. 

- Data Payload. The variables exchanged are the following: the current time instant 
and control signals are sent from the control side to the system side, and the position 
and orientation measurements are sent from the system side to the controller side. 

4.3 Implementation Issues 

One of the main implementation issues of the proposed remote tracking controller is 
the accurate modeling and characterization of the time-delay induced by the commu- 
nication channel. The use of predictor-like schemes is often discouraged because of 
their sensitivity to delay model mismatches [5], especially when considering nonlinear 
systems and a communication channel such as the Internet. To this end, two differ- 
ent methods that ease the implementation of the proposed compensation strategy are 
suggested. Their objective it to provide an accurate estimate f of the real delay r in 
practice. The two delay estimation methods studied are explained below: 

- Delay Measurement. The round trip communication delay between TU/e and 
TMU (and vice versa) has been measured during different times of the day, for 
variable amounts of time ranging from 2min to lOmin, and for a total time of 
around 60min. The mean delay value is approximately 265 ms for both cases 
(267.4917ms TU/e^TMU, 269.5307ms TMU^TU/e). Occurrences of delays 
greater than 300 ms where of 0.27% for TU/e^TMU and 0.34% for TMU^TU/e. 
Thus, the round trip time-delay is fairly constant and can be modeled with enough 
accuracy even if the Internet is considered as the communication channel. 

- Signal Bouncing. The estimator's output may be sent together with the control 
signals to the mobile robot, and then sent back to the controller without being mod- 
ified. By using the communication channel itself to delay the estimator's output, 
modeling the time-delay is no longer necessary (cf. Figure 0) . 

4.4 Experiments 

In the first experiment, a mobile robot at TMU is controlled from TU/e. The reference 
trajectory is a lemniscate with center at (0.5 m, 0.25 m), a length and width of 0.2 m, and 
an angular velocity multiplier of 0.2 m/s. The scenario repeats in the second experiment, 
where a sinusoid with origin at (0.1m, 0.25 m), an amplitude of 0.15 m, an angular 
frequency of 0.3 rad/s, and a translational velocity multiplier of 0.01 m/s constitutes the 
reference. 

The system's initial condition is g(0) = [0.3235m 0.1882m 0.2851 rad] T for the 
first experiment and q(0) = [0.0225m 0.1821m 0.3916rad] T for the second one. In 
both cases the estimator's initial condition is set to z(0) = [0 0] T , the controller gains 
to c\ = 1.0, C2 = cs = 2.0 and the correcting term gains to K x = K y = Kq = —0.6. 
The sampling rate is 25 Hz and the experiments' duration is 60 s and 120 s, respectively. 
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Fig. 5. Remote tracking control strategy block diagram representation using signal bouncing (no 
time-delay models necessary). 



0.5 
0.4 
][0.3- 
0.2- 
0.1 - 



\ r \ 

\ I' \ 

\ A 

i / \ 

\ If \ r 



0.2 0.4 0.6 

X [m] 




Fig. 6. Reference (black solid line), robot (gray dashed line) and predictor behavior (light gray 
dotted line) in the X-Y plane for two different trajectories of a robot in Japan controlled from the 
Netherlands. 



The round trip time-delay is modeled as 265 ms based on measurements, although the 
estimator's output is in fact delayed 280 ms since only delay models which are multiples 
of 40 ms are allowed due to the setup's sampling time. 

The experimental results are shown in Figure and [7] for both experiments. The 
plots in Figure [6] show the reference (black solid line), robot (gray dashed line) and 
predictor (light gray dotted line) trajectories in the X-Y plane, with their initial and final 
position marked with a cross and a circle, respectively. The plots in Figure [7] show the 
evolution of the error coordinates z e (t) = [z\ e (t) Z2 e (t) zs e (t)] T (black) and p e (t) = 
[Pi e W Pz e (t) P3 e (t)] T (g ra y) ^ ov me ^ vs ^ ^^ second experiments (top and bottom, 
respectively). The error coordinates practically converge to zero even in the presence of 
a small delay model mismatch and considering a time- varying communication channel. 
The behavior of the proposed remote tracking controller is consistent with the stability 
analysis presented and the tracking performance of the robot can be ensured even in the 
presence of a network-induced delay. 
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Fig. 7. Practical convergence of the error coordinates z e (t) (black) and p e (t) (gray) for the first 
and second experiments (top and bottom, respectively). 

5 Discussion 

This paper considers the tracking control problem for a unicycle-type mobile robot con- 
trolled over a two-channel communication network which induces time-delays. A track- 
ing control and a state estimator that guarantee tracking a delayed reference trajectory 
has been proposed. Moreover, a stability analysis showing that the tracking and estima- 
tion error dynamics are locally uniformly asymptotically stable has been presented. In 
addition, experiments validate the effectiveness of the proposed approach and show that 
the estimator-control strategy can withstand small delay model mismatches and delay 
variations. 
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Abstract. This document presents a low-pressure servo-valve specifically de- 
signed for haptic interfaces and lightweight robotic applications. The device is 
able to work with hydraulic and pneumatic fluidic sources, operating within a 
pressure range of (0 — 50 • 10 5 Pa). All sensors and electronics were integrated 
inside the body of the valve, reducing the need for external circuits. Positioning 
repeatability as well as the capability to fine modulate the hydraulic flow were 
measured and verified. Furthermore, the static and dynamic behavior of the valve 
were evaluated for different working conditions, and a non-linear model identi- 
fied using a recursive Hammerstein- Wiener parameter adaptation algorithm. 

Keywords: Proportional valve, Hydraulic valve, Pneumatic valve, Mechatronics, 
Pressure control, Servo-mechanism. 



1 Introduction 

Hydraulically actuated robotic systems generally operate at pressures greater than 200 • 
10 5 Pa Q, 0. This is mainly due to the fact that it is convenient to increase the 
force/weight ratio of the actuation system by increasing its operational pressure |3|. 
If from one side increasing the pressure brings advantages, from the other side it could 
represent a limitation. First of all, the hydraulic components need to be designed to re- 
sist the high forces generated by the fluid pressure; this requires therefore to employ 
thick and heavy materials for pipes and actuators. Secondly, the usage of high pressure 
could also cause a dangerous situation for the operators that are in the proximity of the 
robot. The safety issue is even more critical if the robot, in our case an exoskeleton, is 
strictly coupled with the human being [4]. Any failure in the hydraulic system could 
seriously harm the user. 

One of the main goals of the VI-Bot project, under development at DFKI Bre- 
men (Robotics Innovation Center), is to design an intrinsically safe, wearable arm ex- 
oskeleton for Tele-Robotics applications 151617181 . As requirements the haptic inter- 
face should: enable the operator to control complex robotics systems in an intuitive 
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Fig. 1. a) A user wearing the exoskeleton b) The Exoskeleton kinematics equipped with 9 DOR 



way, implement a multi-points haptic feedback to increase the immersion into the work 
scenario, be lightweight and adaptable to different users, and integrate different lev- 
els of safe mechanisms. Furthermore, the kinematics architecture of the system should 
be designed in order to constrain as less as possible the natural arm movements and 
workspace. 

To achieve these goals and at the same time to reduce the complexity of the system 
(number of required DOF), it is necessary to keep the exoskeleton 's joints close to the 
human arm, "ideally overlap them with the human articulations" in order to avoid par- 
allel kinematic loops. It turns out that here the necessity is crucial to have a compact, 
light, highly dynamic actuation. The advantages of using hydraulically actuators to op- 
erate the exoskeleton's joints, if directly compared with classical DC motors, are repre- 
sented by their high force/weight ratio, the possibility to use the axes of the actuator as 
rotational/prismatic axes of the robotic system, and their back-drivability. Furthermore, 
with a proper hydraulic supply and a precise fluid regulation, strength and high dynamic 
range can be achieved [9], [ 10],[ 11 ]. 

FigureQ] shows the actual version of our haptic interface. In total there are 7 actuated 
joints: 5 located in the shoulder/upper- arm and 2 in the forearm. Two additional passive 
joints (8 and 9) allow the wrist supination-pronation. All active joints are hydraulically 
actuated, valves, sensors and electronics are intended to be mounted directly in prox- 
imity of the actuators, this in order to reduce the amount of cables and pipes needed. 
The hydraulic pump and the primary power supply are located outside the exoskeleton 
to avoid additional weight to the system. 

A central element within the hydraulic system is represented by the proportional 
servo- valve. On the market there are plenty of proportional (4/3) hydraulic valves, the 
problem is that most of them are intended to work with high pressure and therefore do 
not fulfill our needs. According to the authors' knowledge, the smallest, lightweight and 
dynamically performing valve on the market is currently sold by MOOG Inc. company 
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[12]. The device weights only 92g, has an hysteresis for the flow characteristics < 3%, 
and a 90° phase-lag > 250Hz. Unfortunately, the device is intended to work only with 
pressure in the range of 160 — 250 • 10 5 Pa. 

Therefore we started to look at the pneumatic components, that generally are light 
and designed for low pressure (up to 10 • 10 5 Pa). We adapted them to work with liquids 
(oil), adding a precise actuation and proper sensory features. 

This document is organized as follows: next section describes the experimental setup 
employed to measure the repeatability and flow-position characteristic of the valve, sec- 
tion[3]presents the static and dynamic models, section|4]introduces a strategy to regulate 
position and velocity of the valve's spool, section Opresents a first fully integrated pro- 
totype. Finally, section [6] draws the conclusions and future developments of this work. 

2 Experimental Setup and Testing 

In this section, the testbed developed to evaluate the performance of the servo valve and 
first experimental results characterizing the valve are presented. 

The experimental setup (Fig. [2]) consists of the core parts of a commercial pneumatic 
valve (Numatics Inc. series Micro- Air), a stepper motor, a gear pump providing pressure 
supply between — 30 • 10 5 Pa, a flow-meter, a pressure sensor, and an electronic board 
equipped with a STM32 //Controller (ST-Microelectronics Inc.). 

The drive system of the valve is a 3.3 V DC stepper motor from Nanotec working 
in full-step mode, i.e. 18 degree/step and with an holding torque of 1.6 • 10 -3 Nm. The 
rotor of the stepper motor is a lead screw, driving a cylinder and thus converting the ro- 
tational motor movement into a translation. The drive is attached to the valve spool via 
a permanent magnet, while the actual valve positions are determined using an inductive 
sensor from Bahllufflnc. by tracking a steel target connected to the extended spool axis. 
Control of the testbed and its components is performed by a STM32 //Controller (se- 
ries F103VE), programmed with a special toolchain consisting of Matlab/Simulink 
and Rapidstm32 Blockset, Real-Time- Workshop and Keil Microvision \i Vision. The 
scheme in Figure [3] sketches the general dependencies of the testbed, were A and B are 
the connections of the valve to the actuator chambers, P s and T are the pressure supply 
line and the tank lines of the gear pump respectively, while the red lines represent the 
communication between the //Controller and the experimental setup through sensors 
and actors. 

Via different tests we measured: 

1 . repeatability of the spool movement with respect to a certain input to the drive 
system, 

2. flow through the valve with respect to spool position and the pressure drop over the 
valve. 

Thus we focus on showing exemplary results of these two features. Position control of 
the fluidic valve is presented instead in section [4] 
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2.1 Performance of the Drive System 

In the following, the ability of the valve's drive system to exactly position the spool 
over a long time interval is investigated. Therefore, the repeatability of valve movement 
is tested by applying a special open-loop control sequence. In the adjustment phase, the 
spool is driven to the center position, while in the second phase the stepper motor is 
governed to move the spool 70 steps out of the zero position in both directions, which 
covers almost the whole working range of the developed prototype. Experiments took 
place under influence of pressure with P s = 15 • 10 5 Pa. Note that in this experiment 
the connectors A and B are connected in short circuit, while the speed of the motor is 
adjusted to its maximum of 1000 steps per second. 

Figure |4] shows the open loop response of the valve prototype to the applied control 
sequence. The duration of the test was 20 minutes, while data was acquired for a time 
interval of 20 seconds every five minutes. 

As we can determine from Figured the position response of the drive system to the 
reference signal is matching for the whole experiment. Differences for the travel of the 
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Fig. 4. Position of spool while performing 70 steps out of center in both directions (P s = 15 • 
10 5 Pa). 

spool, comparing the movement out of the center in both directions, might be caused 
by friction effects. The hysteresis is due to a backlash of approximately 10 steps (i.e. 
0. lmm) between the spindle of the stepper motor and the thread of the cylinder moving 
the spool, when the motor changes its direction of rotation. This is due to a mechanical 
behavior and can only be reduced through higher precision in the manufacturing process 
of these two elements, or via a proper control action. 

Alternatively a special ball- screw could be used, which provides nearly zero back- 
lash. Overall the valve shows remarkable repeatability for an open loop control of spool 
position under influence of pressure for a long time interval. 

2.2 Flow Characteristics of Valve 

In this section, the resulting static flow characteristics of the valve is presented. 

To measure the flow for a fixed pressure drop over the valve, the connectors A and 
B are again linked in a short circuit, causing the valve to work against the pressure in 
the tank line T. A flowmeter from Biotech (series VZS-007-ALU) is connected to the 
pressure supply line providing a resolution of 900 pulses/L at a maximum flow of 5 
L/min to the digital I/O of the /iController. Due to the fact that flow needs a certain 
time to become constant, the valve is driven at a very low speed of 1 step/2s through 
the overall working period. Simultaneously, the average flow is calculated and sampled 
each second. 

Fig(5] shows the static flow characteristics of the hydraulic valve for a constant pres- 
sure drop of AP = 29 • 10 5 Pa. 

From Figure \5\ we can determine that the valve has a large deadband of approx- 
imatively 1.25 mm (between 0.45 mm and 1.7 mm) where it is completely closed. 
This is caused by the inner structure of the valve which is adapted from a pneumatic 
solenoid valve and can neither be influenced nor changed at the moment. Therefore, 
this deadband has to be taken into account in the position control structure presented in 
section [4] 

Having a closer look at the slopes of the flow characteristics, an area of about 0.2mm 
can be attested where flow regulation should be possible. Driving the spool to the 
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Fig. 5. Flow-characteristics of the valve with respect to the spool position (AP — 29 • 10 5 Pa). 

extremes causes flow saturation (U a < 0.2 mm and 1.9 mm < U a ), thus the overall 
working range of the valve prototype can be defined to 

0.15 mm <U a < 1.95 mm, 

whereas the amplitude of the flow depends on the pressure drop over the valve. 

The exemplary results shown in this section certify a good repeatability to the drive 
system of the developed prototype. Furthermore, the static flow characteristic promises 
a possible flow regulation within the defined working range. 

3 Valve Model 



In order to formalize a complete model of the proportional valve, it was divided in a 
static and a dynamic part. The static part, describing the position-flow-pressure rela- 
tionship in the valve, is reported in the next section. The dynamic (electro-mechanical) 
part is described in section [3T2l 



3.1 Static Model 

The position-flow-pressure relationship of a hydraulic proportional valve can be de- 
scribed by the following equation: 



Q = k v * x v * v|AP| * sign(AP) 



(1) 



where Q is the flow [L/min], x v the position of the spool in the valve [m], AP the 
pressure difference in the valves chambers [Pa], and k v the flow factor. 

From our experimental setup, known (measured) values are Q,x w ,AP, so it re- 
mained to calculate the factor k v . So from equation[T]we know that 



/c 7 , — 



x v *y/\AP\*sign(AP) 



(2) 



The flow factor k v is defined for liquids as the flow of water with temperature ranging 
5 — 30° C through a valve in cubic meters per hour (m 3 /h) with a pressure drop of 1 bar. 
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Fig. 6. Values of the flow coefficient k v for pressure values between 5 and 20 Bar. 

The metric equivalent of the flow factor is also in common use and is based on imperial 
units (mainly used in the United States of America) and is called the flow coefficient 
c v . In other words, and explained in a simpler way, the flow factor describes the flow 
capacity of the valve at fully opened position. 

The used pressure signals AP had values between 5 and 20 Bar (50 • 10 5 and 200 • 
10 5 Pa), the measured flow signals were filtered with a Butterworth low-pass filter with 
a band frequency of 3 Hz. Figure[6] shows the values of k v graphically for all data series, 
calculated after equation [2] 

Now k v can be estimated as the maximum value of the graph. The factor reached a 
value of k v = 3.398. With the caculation of the flow factor, the static part of the model 
is fully defined. 

3.2 Dynamic Model 

In this section, a non-linear dynamic model that takes into account the electro-me- 
chanical behavior of the stepper motor, the mechanical behavior of the spool and the 
static/dynamic effects of the frictions present within the system, was identified using 
a Hammerstein- Wiener structure. The linear part of the model can be considered as an 
ARX structure, which is well known and described amongst others in fl3l . For the input 
and output blocks, polynomial nonlinearities with order n\ were used: 



rj(u) = J2(3 k u k 



v(y< 



'lin) 



k=2 

ni 

k=2 



(3) 



(4) 



with yiin as the output of the linear part. 

Model parameters were calculated using the Recursive Least Squares(RLS) Parame- 
ter Adaptation Algorithm (PA A). Using the equation for a Hammerstein- Wiener model's 
output 1 14], the structure of the PAA becomes, with u(t) and y(t) being the real input 
and output signals of the system: 



§(t + 1) = 6{t) + F(t + l)0(t)e°(t + 1) 



(5) 
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Fig. 7. Block diagram of a non-linear Hammerstein- Wiener model. 



with the Adaptation Gain 

F(t+l) = F(t) 
and the Prediction Error 



i + 4>(t) T F(t)4>(t) 



(6) 



(7) 



e\t+l) = y{t+l)-e{t) T 4>{t)- 
6 is the vector of computed parameters, with 

e(t) T = [a(t) T Mt) T J(t) T ,i(t) T ], (8) 

where a(t) T = [ai(t)...a na (t)] are the parameters of polynomial A with order n a , 
b(t) T = \bi(t)...b nb (t)] the parameters of polynomial B with order rib, $(t) T — 
[$i(t)...$ ni (t)] the parameters of the input non-linearity with order ni, j(t) T = 
[71 (£)... % t (t)] the parameters of the output non-linearity with order n\. 
Furthermore <fi(t) is the Predictor Regressor Vector 



</>(ty = [-y(t) : u(t),u(t)\^u(tr^y lin (ty,... : y lin (tr}. 



(9) 



The output of the linear part yu n cannot be measured, nevertheless it can be calculated 
by multiplying the parts of the predictor and the parameters vector corresponding to the 
linear model: 



yiin(t + 1) = [1 : n a + n b + m] (t)(/> [1 : n a + n b + m] 
Finally, the model output y is computed 

£(£+1) = 0> + 1) T (t5(£) 



(10) 



(11) 



The measured signals used for identification are the current absorbed by the stepper 
motor %m as input signal, and the position of the valve's spool xs in terms of mm as 
output. In order to characterize the dynamic behavior of the valve we only considered 
the range of spool positions where the flow can be effectively regulated. As a first step 
the data was filtered using a bandpass filter allowing frequencies between 0Hz and 
60Hz, which does not affect the dynamic range of the model. Different models were 
identified starting from distinct initial values for the parameter vector, the best data 
fitting reached an average of 87.49%. The output of the obtained model can be seen in 
Figure [U whereas the transfer function of the linear dynamic part is given by: 



G(z) = 



-0.16752-0.1039 



0.8512z 2 - 0.135Lz - 0.03855 



(12) 
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Fig. 9. Magnitude Bode Plot of the linear transfer function. 



with the sample time 0.01s. The Bode diagram of G(z) is shown in Figure [9] Finally, 
the static input non-linearity of the model was identified to: 



r)(u) = 0.59831 * u 2 - 0.00047669 * u 3 , 
while the static output non-linearity of the model is given by: 

Viviin) = 0.030651 * yf in - 0.042948 * yf z 



(13) 



(14) 



4 Valve Control 

This section introduces a first approach to control the developed fluidic valve. 

To facilitate precise position control of the valve at a high bandwidth, the drive sys- 
tem is to be controlled in closed loop, using the feedback of the inductive sensor to 
track the desired trajectories given by the /iController. Therefore, classical PID-control 
in combination with a discretization of stepper motor motion is applied. 

A basic requirement for position control is the ability of the drive system to run at 
different speeds. 
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Fig. 10. Scheme of discretized closed loop valve control. 

Due to the fact that the stepper motor is somehow a digital drive, which can only 
have the states run or stop, a discretizing function is introduced to the control loop. The 
idea is to vary the number of samples passing between two step commands by defining 
a variable delay factor r^, which causes a step command only every Td*T s , with T s as 
basic sampling time of the system. 

Thus an increase of r^ results in a slower motion of the valve spool, because less 
steps are performed by the drive system in a fixed time interval. 

The value of r^ is set by the control action u of the PID-controller, tracking the 
desired spool position p^ as follows 



Td 



bl- 1000 



u < 0.013 
u > 0.014 



(15) 



Figure [K)| shows the resulting scheme of the closed loop control. 

Whereas Pmeas represents the measured and pd the desired spool position, the dis- 
cretizing function can be found between the PID-controller and the valve driver. 

To reject noise from the position measurement, a 1 st order Butterworth low-pass 
filter is realized via software in the /iController. The following Figure Qj] shows the 
modulation of the valve speed through the discretizing function. 

To verify the functionality of the discretizing function, the valve is governed to ex- 
ecute a velocity sweep. Starting from r^ = 100 the delay factor is decreased by an 
amount of five every 5 ms until the maximum speed at r^ = 1 is reached. After 




Fig. 11. Speed modulated fluidic valve, blue dashed: valve position, green: velocity. 
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time (s) 

Fig. 12. Step-like reference tracking at (a) f=lHz, (b) f=3Hz, where the green line represents the 
reference. 




Fig. 13. Step-like reference tracking at (a) f=lHz, (b) f=3Hz, where the green line represents the 
reference. 



reaching the final position, the direction of movement is changed and r^ is reset to 
its starting value. It is obvious that the speed of the valve is increased while the delay 
factor decreases and vice versa. 

To test the overall performances of the developed control scheme, sinusoidals with 
different frequencies as well as step-like reference trajectories were fed to the control 
system as reference signals. The parameters of the PID controller have been determined 
through classical Ziegler-Nichols method, whereas the critical controller gain and the 
critical oscillation period respectively were identified as K P}Crit = 1.167 and T crit = 
0.06 s. These results in the following controller gains 



K p = 0.7, Ki= 0.03, K d 



0.008. 



Figure [T3l exemplary presents the tracking results of the position control loop for step- 
like reference signals at 1 and 3 Hz. 

As we can extract from the Figure [131 the controller tracks step-like signals without 
overshoot up to an actual maximum frequency of f=3Hz. Due to the fact that this test is 
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Table 1. Characteristic values for sinusoidal reference tracking. 



/in [Hz] 


e Pm ax in [V] 


<P in [deg] 


0.5 

1 
3 


0.028 
0.032 
0.11 


0.81 
1.8 
3.24 



performed in the maximum working range, a better dynamic response can be expected 
for smaller movements. 

Finally, Table[T]sums up the characteristic values for a sinusoidal reference tracking, 
where / is the frequency of the reference, e Pmax the maximum error in position, and <p 
the phase shift between the two signals. 

5 The Valve Prototype 

After the choice of the proper hardware and electronic components, a new valve was 
designed and realized (Fig. [T4h ) using rapid prototyping technique. The final device 
(Fig. [T4b ) has a volume of LxWxH=(60mm)x(20mm)x(40mm), and has a weight of 
106g including one position and two pressure sensors, an amplifier board, the stepper 
motor, and the electrical/hydraulic connectors. 

In this first prototype the PWM motor driver and the control logic are located in a 
separated device, nevertheless future versions may also include these components on- 
board. This will reduce the amount of required cables to only a single power and a data 
line (e.g. a CAN-BUS). Compared with other state of the art hydraulic valves lfT2l . the 
one presented here is designed to work with relatively low pressures both for hydraulic 
and pneumatics purposes, it is extremely compact and lightweight, furthermore it inte- 
grates two pressure sensors that are directly connected to the two output lines A and B 
(see schema in figure[2]). This allows a fine tuning of the pressure inside the two actuator 
chambers, and therefore enables a precise control of the generated torque. 



Position Sensor 

Pressure Senso 








Actuation r. 
(Stepper Motor) 



V. 

Lines to Actuator 



Fig. 14. a) Valve developed using rapid prototyping technique b) Final valve in aluminum with 
all integrated components. 
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6 Conclusions and Future Developments 

This document presented the development of a new fluid servo- valve. The device is 
specifically intended for lightweight robotic systems and designed to work in a op- 
erational range of — 50 • 10 5 Pa. The work is motivated by the fact that, accord- 
ing to the authors knowledge, no commercial valve exists for precise low pressure 
hydraulic actuators control. As general requirements, compactness, lightweight, and 
high dynamics were considered during the design process. A first series of experiments 
have been performed to test repeatability, flow-position characteristics and dynamic re- 
sponse. A model of the drive system of the servo-valve was identified using a recursive 
Hammerstein- Wiener parameter adaptation algorithm. The combination of a linear and 
dynamic part with a non linear static component let to reach a fit of 87%. Finally to test 
the overall functionality of the valve and to measure its step response characteristics, a 
proper control algorithm was implemented that allows to regulate the position and the 
velocity of the valve's spool. 

Future work will be devoted to identify the overall model that will explicitly define 
the position-flow-pressure relationship. The backlash and dead-band problems need 
to be properly addressed. In particular the employment of a ball-screw for the roto- 
translation mechanism of the valve, instead of a normal lead screw , will improve the 
precision in controlling the position. Furthermore, with a customized design of the 
spool, will be possible to decrease the switching time between the two opening po- 
sitions, and therefore to improve the dynamic behavior of the servo- valve. 

Acknowledgements. The work presented here was done within the VI-Bot project, 
funded by the German Ministry of Science (BMBF), [grant number 01IW07003]. 
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Abstract. This paper introduces a novel approach to learn representations of ob- 
jects using a team of robots. Each robot extracts local and global visual features 
of objects and combines them to represent and recognize objects. Contrary to 
previous approaches the robots do not know in advance the number or nature of 
objects to learn. Individual representations of objects are learned on-line while 
the robots are traversing an environment. Robots share their individual concepts 
to improve their own concepts, and to acquire a new representation of an object 
not seen by them. For that, the robots have to detect if they are seeing a new ob- 
ject or an already learned one. We empirically evaluated our approach with a real 
world robot team with very promising results. 

Keywords: Robotics and automation, Mobile robots and autonomous systems, 
Vision, Recognition and reconstruction, Network robotics. 



1 Introduction 

Collective robotics is a very active research area in the robotics community. Multi- 
robot systems or robot teams have effectively emerged as an alternative paradigm for 
the design and control of robotic systems because of the team's capability to exploit 
redundancy in sensing and actuation. 

The research on collective robotics has focused on developing mechanisms that en- 
able autonomous robots to perform collective tasks, such as strategies for coordination 
and communication [1 2|; exploration, mapping and deployment |3|; sensing, surveil- 
lance and monitoring [ 4 1 ; and decentralized decision making [ 5 ] . In these works, a robot 
team can reduce time to complete a complex task that is allocated among its members. 

Despite constant research on the design of robot teams, very little attention has been 
paid so far to the development of robot teams capable of learning from their interaction 
with their environment. In addition to their capability for accelerated learning, learning 
robot teams can be used to acquire a much richer and varied information compared to 
the information acquired by single learning robots. 

Learning is a key issue to achieve autonomy for both, single robots and robot teams. 
Learning capabilities can provide robots the flexibility and adaptation needed to cope 
with complex situations. In the context of robot teams, the most common machine learn- 
ing approach has been reinforcement learning, where the idea is to learn optimal policies 
using a set of robots to improve the coordination of individual actions in order to reach 
common goals I1I2I4I61 . 
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In this work we use visual information to learn, with a team of robots, descriptions of 
objects placed in a particular environment. Learning to recognize particular objects in an 
environment is important for robotics as it can be used for local and global localization 
tasks as well as for simple service tasks such as searching for objects in unknown places. 
Contrary to previous approaches, in our learning setting, the robots are not told the 
number or nature of the objects to be learned. 

Vision is a primary source of perception in robotics and provides different features 
that can be used to classify objects. In general, using a particular set of features can 
be adequate for particular tasks but inadequate for other tasks. In this work, objects are 
characterized by two complementary features: (i) SIFT features |7] and (ii) information 
about the silhouettes of objects. Other features could be used as well, but the main ob- 
jective in this work is to show the different cases and possible confusions that can arise 
in the recognition of objects and merging of concepts, and how they can be addressed. 

This article deals with the individual and collective representation of objects from 
visual information using a team of autonomous robots. 

The rest of the paper is organized as follows. Section [2] reviews related work. Sec- 
tions [3] and |4] introduce, respectively, the stages of individual learning and collective 
learning of concepts. Section \5\ describes our experimental results, and Section 0pro- 
vides conclusions and future research work. 



2 Related Work 

Interesting experiments where physical mobile robots learn to recognize objects from 
visual information have been reported. First we review significant work developed for 
individual learning, and then we review learning approaches developed for robot teams. 

In (U different learning techniques to acquire automatically semantic and spatial in- 
formation of the environment in a service robot scenario are applied. In that work, a 
mobile robot autonomously navigates in a domestic environment, builds a map, local- 
izes its position in the map, recognizes objects and locates them in the map. Background 
subtraction techniques are applied for foreground object segmentation. Then objects are 
represented by SIFT points |7 ] and an appearance-based method is used for detecting 
objects named Receptive Field Co-occurrence Histograms. The authors developed a 
method for active object recognition which integrates both local and global information 
of objects. 

In [ 9 ] authors applied an instance-based method to train a robot for object recognition 
purposes. The objects are represented by color histograms. Different representations are 
learned from different views of the same object using a mediator that can ask questions 
and provide feedback on the robot's answers. The recognition is performed by classify- 
ing new views of objects using the KNN algorithm [10]. 

In ifTTIl . a scheme for fast color invariant ball detection in the RoboCup context is 
presented. To ensure the color-invariance of the input images, a preprocessing stage is 
first applied for detecting edges using the Sobel filter, and specific thresholds for color 
removal. Then, windows are extracted from images and predefined spatial features, such 
as edges and lines, are identified in these windows. These features serve as input to 
an AdaBoost learning procedure that constructs a cascade of regression tree classifiers 
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(CART). The system is capable of detecting different soccer balls in RoboCup and other 
environments. The resulting approach is reliable and fast enough to classify objects in 
real time. 

In (12 ] authors present an on-line method for learning objects for human-robot in- 
teraction. In this case, the robot's user is included in the learning framework as an 
instructor. For the learning method, authors used a lifelong incremental learning system 
that evolves with any new learned object based on one-class learning (OCLL). Objects 
are represented using normalized shape features. Aditionally, the authors proposed an 
experimental methodology for evaluating a word learning method, and for comparing 
the word learning capabilities of different agents as well as to asses research progress 
in similar works. 

Concerning the problem of collective learning of objects using robot teams there 
are, as far as we know, very few works. In 1 13] the authors address the problem of 
mobile object recognition based on kinematic information. The basic idea is that if the 
same object is being tracked by two different robots, the trajectories and therefore the 
kinematic information observed by each robot must be compatible. Therefore, location 
and velocities of moving objects are the features used for object recognition instead 
of features such as color, texture, shape and size, more appropriate for static object 
recognition. Robots build maps containing the relative position of moving objects and 
their velocity at a given time. A Bayesian approach is then applied to relate the multiple 
views of an object acquired by the robots. 

In (Ml, objects are represented with Principal Components (PC) learned from a set 
of global features extracted from images of objects. An object is first segmented and its 
global features such as color, texture, and shape are then extracted. Successive images 
in a sequence are related to the same object by applying a Kalman filter. Finally, a 
3D reconstructed model of an object is obtained from the multiple views acquired by 
robots. For that purpose, a Shape From Silhouette based technique |T5l is applied. 

The main drawbacks of previous works can be summarized as follows: (i) Most 
approaches are able to cope with a limited number of learning objects, usually 3 to 
12 objects, and (ii) the number of learning objects is known a priori. In contrast to 
previous works, in our method each robot learns on-line individual representations of 
objects without prior knowledge on the number or nature of the objects to learn. Indi- 
vidual concepts are represented as a combination of global and local features extracted 
autonomously by the robots from the training objects. A Bayesian approach is used 
to combine these features and used for classification. Individual concepts are shared 
among robots to improve their own concepts, combining information from other robots 
that saw the same object, and to acquire a new representation of an unnoticed object. 
We also analyzed and provide solutions to the different cases that can occur from infor- 
mation of two individual concepts. 

3 Individual Learning of Concepts 

The individual concepts are learned on-line by a robot team while traversing an envi- 
ronment without prior knowledge on the number or nature of the objects to learn. The 
individual learning of concepts consists of tree parts: object detection, feature extrac- 
tion, and individual training. 
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Individual concepts of objects are represented by Principal Components (PC) over 
the information about the silhouettes of objects and by Scale Invariant Features (SIFT). 
Learned concepts are shared among robots. 

3.1 Object Detection 

Robots move through an environment and learn descriptions of objects that they en- 
countered during navigation. Objects are detected using background substraction. In 
this paper we assume a uniform and static background. We performed morphological 
operations (closing and erode) to achieve better segmentation. Once an object is de- 
tected, it is segmented and scaled to a fixed size, to make the global PC features robust 
to changes in scale and position. 

3.2 Feature Extraction and Individual Training 

The segmented objects are grouped autonomously by the robots in sets of images con- 
taining the same object. Robots assume that they are observing the same object while it 
can be detected, and they finish to see it when they can not detect objects in the captured 
images. Only one object can be detected in an image at the same time. For each set of 
images, the robot obtains an individual concept that represents the object. 

Training Using Global Features. We applied Principal Component Analysis (PCA) 
over the average silhouettes that are automatically extracted from the set of images of 
a particular object. The average provides a more compact representation of objects and 
reduces segmentation errors. Figure [T] (a) shows an image of the object vase used in the 
training phase, FigureQ](b) shows its silhouette, and FigureQ](c) illustrates the average 
silhouette obtained from a set of images that represent the object of Figure Q] (a). Once 
the robot has obtained an average silhouette, this is added by the robot to a set of known 
average silhouettes. After that, the robot uses PCA to reduce the dimensionality of all 
average silhouettes learned to get the PC features that represent them. 

Training Using Local Features. Each robot extracts local SIFT features of each 
image of the set of images, and groups them in a final set which contains all the different 
SIFT features that represent an object. In Figure [I] (d) we show an example of the SIFT 
points obtained from a set of images of a vase and the final set of SIFT points obtained. 
The PC features and the SIFT features represent the individual concept of the observed 
object. 

3.3 Sharing Concepts 

The concepts learned by robots are shared among them to achieve collective learning. 
This can be done off-line or on-line. In the case of collective off-line learning the robots 
share their individual concepts once they have learned all the training objects. On the 
other hand, in the collective on-line learning the robots share their individual concept 
as soon as a new object is learned. 
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(d) 

Fig. 1. Examples of the silhouette (b) and average silhouette (c) of an object (a). Examples of the 
SIFT features extracted from a set of images and the final set of SIFT features (d). 



4 Collective Learning of Concepts 

Collective learning of concepts enables robots to improve individual concepts com- 
bining information from other robots that saw the same object, and to acquire a new 
representation of an object not seen by them. Therefore, a robot can learn to recog- 
nize more objects of what it saw and can improve their own concepts with additional 
evidence from other robots. 

A robot has to decide whether the concept shared by another robot is of a new object 
or of a previously learned concept. A robot can face three possibilities: coincident, 
complementary or confused information. The shared concepts are fused depending on 
the kind of information detected, as described below. 



4.1 Pre-analysis of Individual Concepts 

The concept learned by a robot is defined as follows: 

Cl = {Sill SIFT*} (1) 

where C k is the concept k learned by robot i 9 Sil l k is the average silhouette, and SIFT k 
is the set of SIFT features that form the concept k. 

In order to determine if a shared concept is previously known or not to a robot, it 
evaluates the probabilities that the PC features and SIFT features are previously known 
by the robot. The probability vectors of PC features calculated by robot i,v l P , indicate 
the probability that a concept shared by robot j, C k , is similar to the concepts known 
by robot i, Cf , . . . , C l nurn0h -, given the global features. numObjs is the number of 
concepts of objects known by robot i. The process to obtain the probability vector PCA 
is described as follows: 

- A temporal training set of silhouettes is formed by adding the average silhouettes 



,Sill 



numObjs' 



and the average 



of concepts known by robot i or actual robot, Sil\, ..., 

silhouette of the new or shared concept Sil 3 k . 

- The PCA is trained using the temporal set of average silhouettes. The projection of the 
average silhouettes know by robot i is obtained as a matrix of projections, matProys. 
The projection of the average silhouette Sil 3 k is obtained in a vector, vectProys. 
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- The Euclidean distance (dE) is calculated between each vector of the matrix 
matProys and the vector vectProys as shown in formula^ i.e, we obtain the distance 
between all the projections already computed and the projection of the new silhouette. 



dE\ 



\ 



nEigens 

y (matProysy^ — vectProys^^) 2 (2) 

r=l 



where nEigens is the number of eigenvectors used during the PCA training 
(nEigens = numObjs 1 — 1), and I is the index of the distance vector, where the 
maximum size of the vector dE 1 is numObjs 1 . 

- The distance value dE 1 is divided by a maximum distance value, ThresholdMax, de- 
termined experimentally to obtain a similarity metric also called the probability vector 
PCA, Vp, as shown in formula[3] 

v i = 1 dE l (3) 

1 ThresholdMax 

If dE\ is bigger than the ThresholdMax value, then the probability will be fixed as 
shown in formula |U which indicates that the projections of the object k and the one of 
the object / are completely different. 

' (4) 



1 numObjs 

The value of the SIFT similarity metric also called the probability vector SIFT at the 
position v l s , is obtained calculating the number of coincident SIFT points, n CO i n , be- 
tween the individual SIFT concept SIFTf learned by robot z, and the individual SIFT 
concept SIFT£ shared by robot j. If the number n CO i n is bigger than an average of 
coincidences determined experimentally, AverageCoin, then the probability will be 
fixed to v l s =1.0, which means that both concepts contain the same local SIFT fea- 
tures. Otherwise, the probability will be calculated using formula[5] 

i n coin s~^ 

1 AverageCoin 

The constant AverageCoin represents the average of coincidences between two sets 
of SIFT points of the same object from different perspectives. 

4.2 Analysis and Fusion of Individual Concepts 

This section describes how detection is achieved if the new or shared concept is one of 
the following: coincident, complementary or confused, and how the individual concepts 
are fused to form collective concepts depending on the kind of detected concept. 

Coincident Concepts. A coincident concept is detected when one, two or more robots 
of the robot team learned individual concepts from similar views of the same object. A 
new or shared concept is classified as coincident if v l P > a and v l s > a. That is, if 
both probabilities (PCA and SIFT) of a previously learned concept are greater than a 
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predefined threshold value (a). If a new or shared concept is determined as coincident 
it is merged with the most similar known concept as follows: 

PCA Fusion. It is obtained by evaluating a new average silhouette from the average of 
the known Sil\ and new Sil 3 k silhouettes. After that, it is necessary to re-train the PCA 
substituting the concept Sil\ with the new average silhouette which contains informa- 
tion of the concept learned by robot j. 

SIFT Fusion. It is obtained by adding the complementary SIFT points of concept 
SIFT k to the set of SIFT points of concept SIFTf. Also, each pair of coincident 
SIFT points of both concepts is averaged in terms of position and their corresponding 
SIFT descriptors. 

The main idea to fuse coincident concepts is to improve their representation. 

Complementary Concepts. A concept C k contains complementary information if it 
differs with all known concepts by robot i 9 i.e., if both shape and local features are 
different to all known concepts by robot i, C{, . . . , C l nurn0h - S . That is, ifv l P < a and 
v l s < a. 

A complementary concept C k is fused with the collective concepts known by the 
robot i as follows: 

PCA Fusion. The new average silhouette is added and the new PC features are obtained 
by re-training the PCA using the updated set of average silhouettes. 

SIFT Fusion. The new SIFT features are simply added to the current set of SIFT con- 
cepts known by the robot i. 

Confused Concepts. There are two types of confusion that can occur between concepts: 

Different Shape and Similar Local Features, (type 1): This type of confusion occurs 
when the new concept C 3 k is complementary by shape, Sil 3 k , to all the concepts known 
by the robot i, Sil\, ..., Sil l nurn0b -i but it is coincident by local SIFT features, SIFT k , 
with at least one concept known by the robot i. That is, v l s > a and if v P < a. 

Similar Shape and Different Local Features, (type 2): This type of information occurs 
when concept C k is coincident by shape, Sil 3 k , to at least one concept known by the 
robot i 9 but it is complementary using its local SIFT features, SIFT k . That is, if v l P 



> 



a and v l s < a. 

In both types of confusion, type 1 or type 2, there can be two options: 

a) Different Objects. Both concepts correspond to different objects. 

b) Same Object. Both concepts correspond to the same object but they were learned 
by robots from different points of view. 

In our current approach, both types of confusions are solved as complementary objects. 
The reason is that robot i cannot distinguish with its current information between both, 
different objects or same object, using only the individual and the shared concepts. 
To solve the ambiguity, as future work each robot should build autonomously a map 
and locate its position in the map. In addition, for each learned object, robots will 
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Fig. 2. The robot team 

locate them in the map. For confused objects a robot can move to the position of the 
object marked in the map to see the object from different perspectives in order to solve 
the conflict. 



5 Experiments and Results 

We performed several experiments to demonstrate the proposed algorithm. In section 
15.11 we show the results of a general experiment that demonstrates the main features of 
the proposed approach. In section [5^2l we present the accuracy of the collective concepts 
versus the individual concepts. 

In these experiments we used a robot team consisting of two homogeneous Koala 
robots (Figure [2]) equipped with a video camera of 320 x 240 pixels, and an on board 
portable computer of 1 GB of RAM memory for the processing. For more than two 
robots our method can be applied straightforward. The only difference is that robots 
will need to consider the information from more than one robot, possibly reducing 
confused concepts. 



5.1 Concept Acquisition and Testing 

The mobile robots learn on-line representations of several objects while following a 
predefined trajectory without prior knowledge on the number or nature of the objects to 
learn. The idea of using pre-planned trajectories instead of making the robots wandering 
randomly, is that we can control the experimental conditions to show different aspects 
of the proposed methodology. 

Each robot shares its individual concept as soon as it is learned to improve the repre- 
sentation of this concept or to include a new concept in the other robot. Figure [5] shows 
the training objects used in this experiment. As can be seen in the figure, some objects 
have the same shape but different texture, some have the same texture but different 
shape, some others are not symmetric in their shape. The objective of this experiment 
is to show the performance of the system to detect coincident, complementary and con- 
fused information under a wide variety of conditions. 

In Figure [4] we present how the training objects were placed in the environment, and 
how the robots moved in the environment to see the different training objects. Robot 
1 (Rl) learned during individual training concepts for: dolphin, can, water bottle and 
vase. Robot 2 (R2) learned individual concepts for: vase, soda bottle, bottle and cone. 
Note that some objects are learned by both robots while others are only learned by one 
robot. 
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(a) 

Fig. 3. Training objects, (a) vase, (b) water bottle, (c) can, (d) dolphin, (e) soda bottle, (f) bottle 
and (g) cone. 




Fig. 4. Concept acquisition 



While learning a new concept, each robot has to decide whether to fuse the current 
concept with a previously known concept or include it as a new one. TableQ] shows the 
probability vectors of the PC A features based on shape (vp) and of the SIFT features 
(v~g) obtained by Robot 1. A similar table is obtained by Robot 2 (for details see lfT6l ). 
In Table Q] the coincident information is represented in bold. 

We used the defined criteria in Section [4~2l to recognize coincident, complementary 
or confused concepts, with a = 0.65 as threshold value, and the probability vectors of 
Table[Q 

For instance, in column labeled with PCA of Table \T\ it is shown how the probabi- 
lities of objects of Rl are affected using only PCA over the shapes of objects, as both 
robots encounter and learn concepts while traversing the environment. In the first row, 
Rl learns about the concept dolphin and acquires it. In the second row, R2 then learns 
about vase and shares this concept to Rl (that is expressed with a subindex R2). The 
probability, according to the PCA features to be a dolphin is 0.19 (second row). Rl 
learns the object can, which has a probability of 0.31 to be a dolphin and a probability 
of 0.26 to be a vase, which was learned by R2 and shared to Rl (third row). In the 
fifth row, Rl learns about a water bottle but it confuses with the soda bottle learned 
and shared before by R2. As can be seen from Figure [3 both objects have the same 
shape and consequently the PCA features are not able to discriminate between these 
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Table 1. Probability vectors PCA (vp) and SIFT (v]g) obtained by Rl 





New (collective concepts Rl) 


PCA 


SIFT 


Objects 


Dol- 
phin 


Vase 


Can 


Soda 
bot- 
tle 


Water 
bottle 


Bot- 
tle 


Co- 
ne 


Dol- 
phin 


Vase 


Can 


Soda 
bot- 
tle 


Water 
bottle 


Bot- 
tle 


Co- 
ne 


Dol- 
phin ri 






























VaseR2 


0.19 


- 


- 


- 


- 


- 


- 


0.09 


- 


- 


- 


- 


- 


- 


CanRi 


0.31 


0.26 


- 


- 


- 


- 


- 


0.12 


0.12 


- 


- 


- 


- 


- 


Soda 
bottle R2 


0.36 


0.28 


0.58 


- 


- 


- 


- 


0.28 


0.11 


0.40 


- 


- 


- 


- 


Water 
bottle ri 


0.43 


0.28 


0.53 


0.73 


- 


- 


- 


0.15 


0.59 


0.20 


0.20 


- 


- 


- 


Bottle R2 


0.31 


0.17 


0,56 


0.61 


0.58 


- 


- 


0.08 


0.15 


0.65 


0.04 


0.12 


- 


- 


VaseRi 


0.25 


0.69 


0.42 


0.43 


0.41 


0.32 


- 


0.16 


1.00 


0.23 


0.10 


0.08 


0.09 


- 


ConeR2 


0.31 


0.01 


0.28 


0.28 


0.33 


0.43 


- 


0.05 


0.28 


0.43 


0.10 


0.14 


0.09 


- 



two objects. This is not the case for the SIFT features, which prevent Rl to consider it 
as the same object. In the seventh row, Rl learns about vase which was already learned 
and shared by R2, and in this case both concepts are merged. 

Robot 1 detects the three types of possible information and fuses them as mentioned 
in section l4~2l The first four learned objects (dolphin, vase, can and soda bottle) were 
detected as complementary because all values of their corresponding vectors Vp 
and Vg are less than a threshold. The next two objects (water bottle and bottle) 
were detected as confused (type 2 and type 1, respectively), because either their SIFT 
features, or their shape features are similar, according to the a threshold. Object vase 
learned by Robot 1 was detected as coincident with the learned object vase learned by 
Robot 2 and shared to Robot 1 (vp = 0.69 and v$ = 1.00). Finally object cone, 
unnoticed for Robot 1 , is detected as complementary. 

To test the performance of the individual concepts and the collective concepts ac- 
quired by each robot, the concepts were used in an object recognition task. Each robot 
followed a predefined trajectory to recognize objects in the environment. The objects 
were detected by the robot team in the following order: cone, water bottle, vase, bot- 
tle, soda bottle and dolphin. Once an object is detected, the robot (i) evaluates its class 
using the PCA (v P ) and SIFT (v z s ) probability vectors and combines both probabilities 
using a Bayesian approach, summarized in formula[6] 






x v l Si X P u 



('• 



Pi X v Si x ^ 



) + ((l-v i Pi )x(l-v i Si )x(l-P u )) 



?).«}» 



(6) 



p(PCA 



where P u is a uniform probability distribution (P u = num0b - si j, up 
projection \ Class = i), v l s = p(SIFT matching | Class = i), P % B is the Bayesian 
probability vector (p(Class = i \ PCAprojection, SI FT matching), and I is the 
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Table 2. Precision in the object recognition task using the individual and collective concepts 
acquired by each robot 





Rl 


R2 


R1-R2 


R2-R1 


PCA 


47.82 % (89.70 %) 


52.12% (98.00%) 


95.65 % 


98.93 % 


SIFT 


50.00 % (93.87 %) 


52.13% (98.00%) 


95.65 % 


98.93 % 


Bayes 


48.91% (91.83%) 


52.12% (98.00%) 


95.65 % 


100.00 % 



index of the Bayesian probability vector, where the maximum size of the probability 
vector is numObjs 1 . 

We show in Table [2] the precision of the object recognition task using the individual 
(columns labeled with Rl and R2) and collective (columns labeled with R1-R2 and R2- 
Rl) concepts. Precision is calculated as the number of well classified images from the 
total number of test images that contain an object during one experiment. The precision 
is presented in two ways, one considering the total number of objects, and the other 
one taking into a count only the number of objects used during the individual training 
(reported in parentheses). As can be seen the collective concepts produce a significantly 
better precision. 



5.2 Accuracy of the Individual and Collective Concepts 

In this section we compare the results of the individual concepts with that of collec- 
tive concepts. In each experiment, a different set of objects was used, and both robots 
learned the same set of objects. Therefore, all the shared concepts were coincident, that 
is, robots learned both individually and collectively the same number of concepts. At the 
end of each experiment the robots learned four concepts that were tested in a predefined 
sequence. 

In Table [3j we present the average percentages of accuracy using the individual and 
collective concepts in an object recognition task. The accuracy indicates the average 
percentage of well classified images during the whole set of experiments, in this case 
six, when using the PCA and SIFT features and the Bayesian approach. 

As it can be observed in Table the accuracy that indicate the quantity of well clas- 
sified images using the collective concepts for the object recognition task, is in general 
better than the accuracy using the individual concepts. For PCA, SIFT and Bayes there 
is an improvement in the accuracy of 2.56 %, 13.79 % and 20.62 %, respectively. This 
demonstrates that the collective concepts have better coverage than the individual con- 
cepts because they contain information acquired from different points of view, which 
allows a better recognition of test objects. 

In Table |U we present the average percentages of false positives for both, the indi- 
vidual and the collective concepts acquired by the robots. From the results presented 
in Tables [3] and [4] we conclude that the collective concepts have better quality than the 
individual concepts. 

In general for the individual and the collective concepts, we observed an improve- 
ment in the accuracy when using the Bayesian approach. The average profit in the per- 
centages of classification using the Bayesian approach using the collective concepts 
with regard to the individual concepts is of 14.63 % . 
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Table 3. Accuracy of individual and collective concepts 





PCA 


SIFT 


Bayes 


Individual 


84.94 % 


67.88 % 


80.18 % 


Collective 


87.18 % 


81.12% 


94.81 % 



Table 4. Average percentages of false positives of individual and collective concepts 





PCA 


SIFT 


Bayes 


Individual 


14.42 % 


0.64 % 


0.64 % 


Collective 


13.14 % 


0.00 % 


0.00 % 



6 Conclusions and Future Work 



In this paper we have introduced a new on-line learning framework for a team of robots. 
Some of the main features of the proposed scheme are: 

- The robots do not know in advance how many objects they will encounter. This 
poses several problems as the robots need to decide if a new concept or a shared 
concept, is of a previously learned object or not. 

- The representation of objects are learned on-line while the robots are traversing a 
particular environment. This is relevant for programming autonomous robots. 

- Three possible cases in which to merge concepts and how to merge them were 
identified. 

The detection of coincident concepts avoids producing multiple concepts for the same 
object. The detection of complementary concepts allows to detect and learn unknown 
objects not seen by a particular robot. The detection of confused concepts allows to fuse 
information: (i) when the objects have different shape and similar SIFT features, and 
(ii) when the objects have similar shape and different SIFT features. These cases are 
particularly difficult to deal with because the objects may be genuinely different or may 
be the same but seen from different points of view by the robots. 

In general, the object recognition using the collective concepts had a better perfor- 
mance than using the individual concepts in terms of accuracy. This occurs because the 
collective concepts consider information from multiple points of view producing more 
general concepts. 

As future work we propose to integrate schemes to object segmentation for dynamic 
environments. For instance, using an object segmentation based on distance as in ifTTl . 
Use a different set of features and identify possible conflicts between more that two 
kinds of features. We also plan to incorporate planning of trajectories to autonomously 
allocate the environment among robots. We also plan to add strategies to solve some 
confusions in shared concepts by taking different views from these objects. Finally, we 
plan to incorporate our algorithm for robot localization and search of objects, and to 
test our work for robot teams with three or more robots. 
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Abstract. This paper concerns the detection and the tracking of moving objects 
from images acquired by a single camera Embedded on a mobile robot. The pro- 
posed approach is based on the active extraction of salient features, feature track- 
ing on some images, feature clustering as a way to detect moving objects. 

Keywords: Moving obstacles, Detection, Tracking, Clustering, Monocular 



1 Introduction 

A key function required for autonomous robot navigation, must cope with the detection 
of objects close to the robot trajectory, and the estimation of their states. This func- 
tion has been studied by the robotic and the ITS (Intelligent Transportation Systems) 
communities, from different sensory data such as laser, radar, vision, among others. 
For driver assistance, many contributions concern laser-based obstacle detection and 
tracking [15]. In order to track obstacles, it is required to estimate from the same sen- 
sory data, the egomotion of the vehicle. The egomotion could be obtained from Inertial 
Measurement Units (IMU) or estimated by using the images and the epipolar geome- 
try (7). First solution needs that IMU be coupled rigidly to the camera which is normally 
not reached. In other hand, egomotion image-based is inaccurate in the case of noisy, 
low contrasted images. The disadvantages of both approaches are tackled in [9], where 
authors present an alternative camera-IMU system to acquire the images and accurately 
measure actual camera motion. However, this solution has low portability. So, in spite 
of numerous contributions, object tracking during robot navigation still remains a chal- 
lenge when it is based only on vision. So this paper proposes a strategy to detect static 
points, and moreover to detect and cluster the moving ones in order to track mobile 
objects: it is the first step towards the full integration of a Visual SLAMMOT approach. 
It is proposed to reach this objective, using only a monocamera system. 

There are various methods for feature detection and tracking using a single mobile 
camera. A.J.Davison (4) has proposed a spatio-temporal approach, in order to detect 
and track 2D points from an image sequence and then to reconstruct corresponding 
3D points used to locate the camera. The system state at instant time t, is given by 
the 3D position and the speed of the camera and every tracked point selected by the 
Harris detector. From this state, every point position is predicted in the image acquired 
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at time t + 1, forming an elliptic zone where the point must be found if the global 
state is consistent. Then, each interest point is searched in its predicted zone by a 
similarity measurement based on correlation score using a n x n template around its 
first position. 

A method widely used for robotics applications is based on the optical flow. If optical 
flow is only extracted for interest features, i.e. for a very small part of total image points, 
a tracking procedure of many objects characterized by some points, could be applied 
in real time. Such a sparse solution has been proposed by Shi-Tomasi 1 13] and it is 
commonly used in computer vision because of its simplicity and its low computational 
cost. Our own method is based also on this method as a valid and confirmed procedure, 
that can be applied in a real time context during navigation. Among others, let us cite 
the work presented in [ 10 ] where authors have used the optical flow field to leverage the 
difference in appearance between objects at close range and the same objects at more 
distant locations. This information allows them to interpret monocular video streams 
during off-road autonomous navigation and to propose an adaptive road following in 
unstructured environments. This method has been evaluated for the navigation of an 
intelligent vehicle in a desertic terrain. 

However, once interest points and optical flow are extracted and tracked from an im- 
age sequence, it is so important to distinguish which of those tracked points represent 
moving objects. Clustering techniques are the first basic solution to this question, but 
unfortunately most of them require initial information about the scene as the number 
of clusters to find. The success of these methods highly depends on these parameters. 
T. Veit et. al 1 14] coped with the same issue for the analysis of short video sequences. 
They validate a clustering algorithm based on the a contrario method 1 6 ] which does 
not need parameter tuning or initial scene information for finding clusters of mobile 
features. This approach has been also used in [12] to detect moving objects in short 
sequences; additionally, authors obtain 3D components of feature points to better de- 
tect the correspondence between points and moving objects on which they have been 
extracted. This work presents experimental results on real images, acquired from fixed 
cameras, so that essential issues of autonomous navigation are not considered. 

This paper proposes moving object detection and tracking on a robot navigation 
context based on KLT tracker and the a contrario clustering. Robot motion is obtained 
from the Inertial Measurement Units then compensated in order to obtain real velocity 
of the interest points. The resulted clusters of dynamic points are delimited by a B- 
spline contour then initialized as moving objects and tracked by a Kalman Filter. At 
each iteration, image locations are ponderated by a bi-variate pdf function when a point 
is tracked. This value represents the probability that the location contains a dynamical 
point, being the most interesting location where to find an interest point. 

Next section explains the proposed strategy. KLT procedure used to detect and track 
interest points is briefly described in section 3. Section 4 describes the main concepts 
of the a contrario theory. The global detection-clustering-tracking approach and the 
use of the occupancy map in a long sequence of images are presented in section 5. 
Experimental results on real images are presented and discussed in section 6. Finally 
last section concludes and explains future works. 
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Fig. 1. Algorithm to detect and track multiple objects 



2 Overall Detect, Cluster and Track Strategy 

General block diagram in Fig. Q] describes the procedure carried out to detect and track 
multiple moving objects. N initial feature points are selected in the input image(t) using 
the Shi-Tomasi approach. Feature locations in next image are searched by the KLT 
tracker, based on correlation and optimization processes. Let us suppose first that no 
object is tracked; so the process loops on N im successive images executing three task, 
feature tracking, ego-motion computation and update of the occupancy map. We will 
call time of trail, this set of N im processed images, i.e. the number of images used to 
accumulate positions and apparent velocities of tracked features. In order to select new 
features, we seek into the occupancy map the most interesting zones in the image for 
searching new points. KLT process is executed continuously in this way, while the robot 
navigates in order to provide new visual information at each time of trail. 

At the end of each time of trail, only moving KLT features are selected for being 
grouped by the a contrario clustering method. Once robot ego-motion is compensated, 
moving KLT features are characterized by a velocity vector of the higher than a thresh- 
old set to 1 pixel. Every resulted cluster is directly initialized as a moving object using 
the most external points in the cluster to calculate a contour. Then dynamic objects are 
tracked with a position and an apparent velocity estimated by a Kalman Filter. At every 
iteration, this object could be merged with another previous detected object based on 
similar velocity and close position. Implementation details of both clustering and merg- 
ing process are presented in section [4] Finally, object current positions are stored in an 
occupancy map being the highest values of probability the zones of the image which 
are not interesting to look for new features. 

3 Feature Selection and Tracking 



Optical flow procedure used in this work is based on the initial technique proposed 
in ifTTIl and on the well-known Select Good Features to Track algorithm |[T3l . This 
technique is largely used in the robotics community because it proposes to match the 
more salient points, minimizing the processing time. This is a sparse method since only 
few points are initially selected to describe image content. 
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Fig. 2. Accumulated Optical flow in 5 successive images from the sequence of the Fig. [9] 

Detection of Moving Points. N distinctive feature points are initially extracted from 
image to by the analysis of spatial image gradient in two orthogonal directions. Loca- 
tions of these N points in next image are obtained by maximizing a correlation measure- 
ment over a small window. Iterative process is accelerated by constructing a pyramid 
with scaled versions of the input image. Furthermore, rotation, scaling and shearing are 
applied on each correlation window by optimizing a linear spatial transformation pa- 
rameters during iterative process. Once displacement vectors are obtained for all initial 
features, apparent velocities are estimated based on displacement vectors. 

Fig. [2] (frames 54 to 58 extracted from the sequence shown in the Fig [9j) depicts 
TV = 150 initial feature points detected and tracked during a time of trail. From these 
accumulated locations, blue points represent points with long optical flow displace- 
ments. For these points the assumption that they belong to a moving object has a high 
probability. 

Perception on Moving Objects. By analyzing optical flow extracted from images ac- 
quired from a moving camera in dynamic environments, we deduce that larger is the 
time of trail, better will be the perception of moving objects, but also of the camera 
motion. In order to minimize the ego-motion impact, we propose to reduce this time of 
trail to 5 images which covers up ego-motion and also, points out independent move- 
ments. This can be verified in Fig. [2 a constant movement in the cloud of points can 
be seen in the middle of the image and the ego-motion is less remarkable. Furthermore, 
the best advantage of this choice N irn = 5 is the reduction of the waiting time before 
performing the a contrario clustering task. This strategy works well essentially when 
the robot velocity is slow so with the aim of reaching higher velocities during naviga- 
tion we propose to compensate robot ego-motion using an Inertial Measurement Units 
mounted on the robot. 

Once any moving object is detected, the number of its dynamic points is subtracted 
from the N points tracked permanently by KLT. Thus in following iterations, KLT 
will select less than TV new points. This strategy allows to track only a fix number of 
N points between KLT and the object tracking process. This rigorous control on the 
number of points is important in our methodology because long image sequences will 
be evaluated and the performance directly depends on the number of processed points. 
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4 A Contrario Method and Merging Process 

Visual perception is a complex function that requires the presence of "salient" or "ev- 
ident" patterns to identify something that "breaks" the continuous motion due to the 
camera ego-motion. This "salient pattern" corresponds to "meaningful event" detected 
by the a contrario method |3 ]. Basic concepts of the a contrario clustering inspired by 
the Gestalt theory, are exposed in [5] and deeply in [6|. In general, Gestalt theory es- 
tablishes that groups could be formed based on one or several common characteristics 
of their elements. In accord to this statement, an a contrario clustering technique (pro- 
posed by Veit, et. al. 1 14 1) identifies one group as meaningful if all their elements show a 
different distribution than an established background random model. Contrary to most 
of clustering techniques, neither initial number of clusters is required nor parameter 
has to be tuned. In our context of unknown environment, these characteristics are very 
favorable. 

4.1 Evaluation of a Background Model 

We use the background model proposed in [14] which establishes a random organiza- 
tion of the observations. Hence, background model elements are independent identically 
distributed (iid) and follow a distribution p. The iid nature of random model components 
propose an organization with not coherent motion present. 

Next, given an input vector V(x, y, v, 9) in R 4 , first objective is to evaluate which 
elements in V show a particular distribution contrary to the established distribution p 
of the background model (that explains "a contrario" name). To avoid element by ele- 
ment evaluation, first, a binary tree with V elements is constructed using a single linkage 
method. Each node in the tree represents a candidate group G that will be evaluate in a set 
of given regions designed by H. This set of regions is formed by different size of hyper- 
rectangles that will be used to test the distribution of several data groups. Each region H 
is centered at each element X in the group until finding the region Hx that contains all 
the group and at the same time makes minimal the probability of the background model 
distribution. Different size of hyper-rectangles are used in function of data range, in our 
experiments we use 20 different sizes by dimension. Final measure of meaningfulness 
(called Number of False Alarms NFA in referenced work) is given by eq[T] 

NFA(G) = N 2 -\H\ min B (N - l,n - l,p (H x )) (1) 

x e g, 



In this equation N represents the number of elements in vector V, \H\is the carnality of 
regions and n is the elements in group test G. The term which appears in the minimum 
function is the accumulated binomial law, this represents the probability that at least 
n points including X are inside the region test centered in X (Hx)- Distribution p 
consists of four independent distributions, one for each dimension data. Point positions 
and velocity orientation follow a uniform distribution because object moving position 
and direction are arbitrary. In other hand, velocity magnitude distribution is obtained 
directly of the empirically histogram of the observed data. So that, joint distribution 
p will be the product of this four distributions. A group G is said to be meaningful if 
NFA(G)<1. 
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Fig. 3. False dynamic object detected on the ground for higher motion velocity 

Furthermore two sibling meaningful groups in the binary tree could belong to the 
same moving object, then a second evaluation for all the meaningful groups is calcu- 
lated by Eq. [2]To obtain this new measure, we reuse region group information (dimen- 
sions and probability) and just a new region that contains both test groups G\ and G 2 
is calculated. New terms are N f = N — 2, number of elements in G\ and G 2 , respec- 
tively n' 2 = ni — 1 and n' 2 = n 2 — 1, and term T which represents the accumulated 
trinomial law. 

NFA G (G u G 2 )=N 4 .\H\ 2 T(N , ,n , 1 ,n 2 , PuP2 ) (2) 

Both mesures[T]and[2]represent the significance of groups in binary tree. Final clusters 
are found by exploring all the binary tree and comparing if it is more significant to 
have two moving objects G\ and G 2 or to fusion it in a group G. Mathematically, 
NFA{G) < NFAg{G u G 2 ) where G t U G 2 C G. 

4.2 Filtering False Dynamic Points by Estimated Camera Motion 

This method has been first evaluated on numerous image sequences. Even if the method 
gave good results for slow camera motions Q, it was noted that false dynamic objects 
could appear for fast camera motions (especially rotations). For instance, on figure [3 
a static region is detected as mobile on the ground. How to filter such bad detections? 
To overcome this problem it is proposed an approach based on the estimated camera 
motion, provided by inertial sensors fixed on the camera. For this strategy flat ground 
assumption is considered, so the apparent motions of points located on the ground could 
be compensated from their estimated motion calculated by projection geometry and 
IMUdata. 

First, IMU delivers the information of the 6 rigid motion parameters: 3 coordinates 
for translation and 3 angles for rotation [X, Y, Z, ip, a, \P]. We assume that robot moves 
in the XwYw plane of the World reference frame denoted by W on Fig. [4] Note that 
the difference between World and Robot reference frames is only the vector translation. 
Important rotations in cp and \P angles are present between Camera and Image refer- 
ence frames. Let us note Hcr the fixed transform between the Camera and the Robot 
reference frame, K the intrinsic matrix of the camera, and M t -N t the estimated robot 
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Hci 




Fig. 4. Coordinate systems for the World (W), the Robot (R), the Camera (C) and the Image (I) 

motion between images acquired at time t — N and t. So the camera model is expressed 
at time t — N in the robot frame by equation [3j 



[su t - N , sv t -N, s] T = KH C r [X t -N, Yt-N, Zt-N^\ T 



(3) 



Then the 3D point can be reconstructed if this point belongs to the ground plane 
(Zt-N — 0) or to the infinity plane (X t -N = oo, e.g. 100m). So if this point is static, 
its projection on the image acquired at time t can be predicted using the equation |4j 



\SUt,SVt,S\ 



KHcrM, 



[X t -N, Y t -N, Z t -N, 1] 



t-N,t l-^t-N 



(4) 



in the two last equations s represents the scale. Then the tracked feature is considered 
as static, so not considered for the clustering process, if \/\{u t — u t ) 2 + (v t — v t ) 2 ) is 
less than one pixel. 



4.3 Merging Groups 

This function is executed only if moving objects have already been detected. O is a 
set of M objects that will contain all candidate objects for merging evaluation. That is, 
O = Ot U Oc where Ot consists of(l,2,...,fc) moving objects tracked by Kalman 
filter, and Oc consist of (1,2, ...,/) new moving clusters, interpreted either as new 
moving objects, or part of existing ones . For each object in O, the velocity vector is 
modeled by the mean of their velocity components in X and Y, respectively represented 
by fax and [i v y • We use these models to evaluate eq|5]that let establish a decision 
constraint for merging. 



mm 

i,j e m, 

i # 3, 
Oi ,Oj CO 



sd^vxiO^^vxiOj)) 

s(fJL v Y(Oi),fJL v Y(Oj)) 



< 



d v x 
d V Y 



(5) 



We evaluate the similarity measure s which performs the subtraction among velocity 
models for each object in O. Parameters d v x and d V Y are constant values set to one 
pixel in accord with the previous established threshold for detecting moving features 
in KLT process. This value is chosen in order to conserve the best trade off between 
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the threshold of moving points in KLT module and the expected bias among object 
velocities in the scene. This evaluation is carried out in a linked way, where merged 
groups are removed from O and added as a new object at the end of the list with, 
obviously, a new corresponding velocity model. This strategy allows the merging of 
the same objects previously detected with a more enriched model. See for example the 
image 7a in which the gray car that enters in the scene is detected as two different object, 
then several images later in image 7b the car is totally detected. From that case, even 
when both Ot and Oc object velocities are not resulted from the same process, they 
could be compared because both are based on pixel displacement in the scene (their 
optical flow). 

5 Moving Objects Tracked by a Kalman Filter 

Groups found by clustering technique are composed of point locations at different pro- 
cessing time. Hence, to confirm that moving object is still in the scene, only points 
present in last processed image are taken to initialize each cluster as a moving object in 
Oc- Therefore only the points located on the current image will be used to model the 
moving object. 

To track moving objects in Ot along the image sequence, Kalman filter prediction 
evaluates a constant velocity model. To initialize this model a vector state is defined 
for each moving object detected. Vector state consists of the bary center of object in 
X and Y and its velocities are set to \i v x and \i v y values, respectively. We assign 
the estimated position and velocity calculated by Kalman filter to baricenter of the ob- 
ject. This estimated position is used as the center of a window that will be extracted 
in the next image in order to search the object points. We called this window the 
zone of object and its size is a function of previous object limits and a security mar- 
gin. Once object region is extracted in next image, we carried out a correlation process 
to find new object location. 

5.1 Border Object Initialization and Tracking 

The concept of object developed here covers the management of several points which 
have been evaluated as a part of a moving object. Rigid and non-rigid moving objects 
could be detected in the scene. As no general size and form of the objects is established, 
we calculated an initial contour with the dynamic points that forms the cluster. Thus, 
initial object model is defined by a linear combination of several B- spline functions and 
a set control points |2|. 

From the set of dynamic points in the cluster, we choose only the farthest points 
of the center, i.e. points in the perimeter, to define different B- spline functions of the 
contour. Figure 5a shows in blue the moving points detected on a dynamic rigid object, 
then in magenta the points found to describe the object border. An initial contour for a 
non-rigid object could be also defined by a B-spline function, this is shown in Fig. 5tQ. 



1 Image sequence downloaded from |8 1 courtesy of EC Funded CAVIAR project/IST 2001 
37540. 
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(a) 



(b) 



Fig. 5. Object contour initialization, for a rigid moving object, image (a), and non-rigid moving 
object, image (b) 

Contour evolution until object real limits is still in development, the main idea is to 
minimize the internal and external energy based on the intensity or in the color of 
selected object region. So, up to now the contour is updated in according to the new 
positions of its control points at each image, found by correlation as the rest of the 
points in the model object. 

Feature Point-based Model. In this work, the task of tracking consists in following 
each dynamic point in the object with its particular appearance based only on their 
velocity and position. Thus, model object initialization consists in extracting a window 
patch (a template) around each point in the image where object was detected. The same 
number of templates are extracted around the estimated feature location in the next 
image. Appearance of initial templates in the current image is updated by an affine 
model. Feature points could be removed from the model if one or more of the following 
cases happen: 

- Feature location is not found by correlation 

- Location found is not inside of the object contour 

- Displacement and velocity of the points found are not inside of the normally 
distribution of their respective mean data. 



5.2 Occupancy Map 

In our algorithm, it is important to add new points that complete the model of the de- 
tected moving object and at the same time detect new incoming objects. In a context of 
unknown environment, feature points should be initially chosen in all the image without 
highlighting some locations. To overcome this uniform point selection along robot nav- 
igation process, a space-time occupancy map is constructed by cells centered on tracked 
KLT points indicating occupied locations. This map develops the idea of an occupation 
grid in the sense of higher probabilities represent the locations in the image that are 
no important to seek for a new point, like occupied locations (see Fig. 6a). So, mainly 
locations in the image with lower probability values in the map will be first used to look 
for new points. 
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(a) 



(b) 



Fig. 6. Image (a) shows the initial state of the occupancy map for an image of 640 x 480 pix- 
els with 150 cells (in white) centered on the feature points. Image (b) shows simulated results 
of the occupancy map for a tracked point in 5 successive images. Inverse shapes represent the 
probability in the four previous images, higher shape depicts current point location. 



First the probability for all map locations are initialized to po = 0.5 that rep- 
resents the initial fair selection of feature points, that is, p(u,v,t = 0) = po- 
When an interesting point is detected in (u,v) at t > 0, it becomes the cen- 
ter of an occupied cell. The probability for occupied cells is given by eqs. [6] 
and [7] where Q(u,v\n u , n v ,cr u ,cr v ) is a 2D Gaussian pdf with parameters fi u — 
u, fi v = v, a u = a v = |, being r the cell size. A cell becomes empty 
in function of interest point displacements at every iteration: if a tracked point 
leaves the cell it was using at time t — 1 its probability is updated and then 
this cell is labeled as empty in the current image t. The new cell in which 
tracked point is located at time t becomes now an occupied cell. The Fig. 6b de- 
picts simulated results of this map construction along five successive images for 
a tracked point. A square cell of size 11 x 11 centered on (u,v) point location 
is used. The inverse effect of function a(u, v) applied to previous point positions 
is clearly seen in this image. With this probability map we enhance the assump- 
tion of new points of incoming object will appear behind its current detected point 
positions. 



p(u, v) t = a(u, v) t + G(u, v\fjb u , fi v ,a u , a v ) 



(6) 



where a(u,v) function describes the previous probability in the cell according to the 
previous state of location (u, v), that is: 



a(u,v) t = < 1 



Po si Cell(^, v) = occupied 

■p{u,v) t -i si Cell(/u, v) = empty 



(7) 



To avoid that this map stores for long time the same probability values in some 
locations, the map is reset each 2 times of trail process, except for the current object 
tracking locations. 
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6 Experimental Results and Discussions 

Proposed strategy have been implemented in C, C++ and TCL and included as a mod- 
ule into a framework for developing algorithms in robotics. Robot navigation was per- 
formed in indoor and outdoor context with a camera mounted on the robot (640 x 480 
at 10Hz). The number TV of tracked points by KLT is set to 150. Robot odometry 
is obtained from a file for each image time, this information let to increase the robot 
velocity. 

The Fig. [7] shows the algorithm performance with object occlusions. In this case the 
car which is closer to the robot is well detected and tracked. Some problems to perfectly 
detect the second car after the occlusion occur and image 7c shows that the algorithm 
divides it in 2 different moving objects. 




(a) Image 106 



(b) Image 136 



(c) Image 176 



Fig. 7. Algorithm results for occlusion objects. Gray car is detected by two different regions then 
it is totally detected several images later. A second car that crosses the scene is detected before 
and after its total occlusion. 



IMU Robot-position in W 





(a) 



(b) 



Fig. 8. IMU information obtained from image sequence of Fig. [9] Image (a) shows robot trajec- 
tory performed on the XYZ World reference frame. Image (b) shows information about the three 
rotation angles (yaw, pitch, roll) . 



Second test was carried out on a sequence of 1900 images. Fig. [8] shows robot po- 
sition in the world frame and its the yaw, pitch and roll angles given by the odometry 
measures. Note that rotation angles show negligible change values until image 400, so 
we have selected 6 representatives images from 600 image, in order to illustrated more 
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(a) Image 644 



(b) Image 665 



(c) Image 720 




(d) Image 1121 



(e) Image 1144 



(f) Image 1197 



Fig. 9. Experimental results in outdoor environment. First row shows moving object detection 
and tracking for a car that moves in the same sens of the robot, furthermore it does not cross the 
field of view of the robot. On the second row: moving object detection before the occlusion is 
successfully handle by our proposed method. 



real and common robot motions like rotations and translations at the same time. Unlike 
to results shown in Fig. [TJ the first row of Fig. [9] shows the detection and tracking of a 
car that not crosses the field of vision of the robot (fov). The car is initially detected on 
front right tire level, after when this part is hidden, the rear of the car is detected and 
tracked until its motion is not more perceived on the horizon. At the same time robot 
starts to turn left (see significant changes in the yaw angle) so this shows successfully 
detection results under real challenge situations. In the second row, on image 9d the car 
is moving towards the robot but it continues to appear only on the left side of the fov 
of the robot. After, the image 9e shows the detection of both cars before the occlusion. 
Finally, the dark color car is detected and tracked with success until it is out of the/bv. 
An extension of tests is carried out for evaluate the development of our algorithm 
during faster robot displacements. The Fig. [TO] shows some images from a sequence 
acquired during robot navigation around the laboratory. The velocity of the robot is 
around 2 m/s, unlike the last two resulted sequences that were acquired at around 1 
m/s. Middle image on this figure shows the successful detection of a truck that moves 
towards the robot. However, a noticeable "dynamic" object is detected on the right 
side of the image. We have been interested in showing this image because that proves 
the ideal assumptions that we have established. That is, even when our ego-motion 
strategy (see section 14.21) has been proposed to filter out moving objects caused by 
the robot motion, this object is detected because it is located on the sidewalk, that is 
not at the same level of the robot XY plane. Thus, to overcome this restriction in our 
algorithm, a strategy that involves mobile 3D points projection of the KLT features 
will be carried out. Thus, camera motions could be corrected only a posteriori, using 
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(a) Image 159 



(b) Image 169 



(c) Image 189 



Fig. 10. Outdoor experimental results at higher velocity. Two dynamic objects are detected: a 
false moving object on the ground and a real moving truck (see the text for a more detailed 
explanation). 

a SLAMMOT module that reconstructs these 3D points, considered either as static or 
mobile. A Maximum Likelihood test will be then applied in order to select the more 
probable between these two hypothesis. 

7 Conclusions 

Experimental results show that even with few images, it is possible to detect rigid mov- 
ing objects by a spatio-temporal analysis of features. Object model is enlarged thanks 
to prior knowledge managed by the proposed occupancy map. This map is success- 
fully used during the active search of feature points because it mainly highlights zones 
that certainly contain new moving interesting points. Our tests are performed offline on 
recorded sequences; however, the global algorithm works fast and could process images 
at 10Hz. The clustering method is the highest time consuming in the global process; for 
that reason, the number of points to be grouped by the clustering method, should be no 
more to 150 points. Thus, the trade-off between image size and that number of points 
guarantees the highest performance in overall strategy. This work is being extended to 
change the current feature point-based model for an active region-model in which func- 
tions of higher level as object identification and classification could be performed. In 
the general case, it is not possible to remove camera motions from the apparent motions 
of tracked features. So a global strategy to avoid ego-motion detection and non rigid 
moving objects is being integrated based on monocamera SLAM approach. An inter- 
change of 3D and 2D points information between SLAM and our MOT process will be 
continuously carried out giving a cooperative sense to our new proposed strategy. 
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Abstract. Linearizing control is a popular approach to control bioprocesses, 
which has received considerable attention is the past several years. This con- 
trol approach is however quite sensitive to modeling uncertainties, thus requiring 
some on-line parametric adaptation so as to ensure performance. In this study, 
this usual adaptive strategy is compared in terms of implementation and perfor- 
mance to a robust strategy, where the controller has a fixed parametrization which 
is determined using the LMI framework so as to ensure robust stability and per- 
formance. Fed-batch cultures of yeast and bacteria are considered as application 
examples. 

Keywords: Nonlinear robust control, Adaptive Control, Fermentation Process, 
Biotechnology. 



1 Introduction 

1.1 Context and Motivation 

The culture of host recombinant microorganisms is nowadays a very important way 
of producing biopharmaceuticals. Fed-batch operation is popular in industrial practice 
since it is advantageous from an operational point of view to feed the cells (also called 
biomass) progressively as they grow, instead of overfeeding the bioreactor, which can 
inhibit the cell growth. The off-line determination of the feeding profile is usually sub- 
optimal as some security margin has to be provided in order to avoid this possible excess 
of substrate leading to the accumulation of inhibitory byproducts (inhibition of the cell 
respiratory capacity), namely ethanol for yeast cultures and acetate for bacteria cul- 
tures. To optimize the culture conditions and to avoid high concentrations of inhibitory 
byproducts, a closed-loop solution is required, and a wide diversity of approaches, e.g., 
03, El, 11, G3, E2, (3, EH, have been considered. 
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As discussed in (4), several control strategies can be applied to define an optimal 
feeding profile (usually considered as the sole manipulated variable in most of the pub- 
lished studies). In order to maintain the system in optimal conditions, it is possible to 
control the RQ (respiratory coefficient) of the cells using gas analyzer measurements, to 
keep the system at a very low critical substrate concentration, or to control the byprod- 
uct concentration at a sufficiently low level. 

For example, in [18] and |8], two extremum- seeking strategies regulating the sub- 
strate level around the optimal critical value are investigated. This kind of approach can 
however only be applied when this critical value is measurable, which is generally dif- 
ficult for yeasts and bacteria cultures considering the sensitivity of the probes currently 
available on the market. 

In Q, a classical PID with gain scheduling is used to control the byproduct concen- 
tration. However, this investigation shows that a PID controller is unable to track an 
exponential trajectory as small tracking errors evolves exponentially. 

In lfT2l and Q, a robust linear controller is applied to regulate the byproduct concen- 
tration. The main advantage of this controller is to require minimum a priori knowledge, 
i.e. one yield coefficient and the measurement of the byproduct concentration. 

Among the published studies, linearizing control |2| is a very popular approach, 
which has been applied succesfully in a number of case studies (see [11], [4], [13]). 
However, linearizing control is by essence model-based and, therefore, requires the 
knowledge of an accurate model. As bioprocess models are generally uncertain, adap- 
tation is required using observer-based or least- squares strategies. Whereas parametric 
adaptation is a simple approach, it does not guarantee stability in the presence of un- 
modeled dynamics or too high noise levels. 

In this paper, another approach is considered, which is based on nonlinear robust con- 
trol and the use of Linear Matrix Inequalities (LMIs) to design the free linear dynamics 
so as to ensure robust stability and performance in presence of model uncertainties. A 
comparison of the adpative and robust control approaches is provided in terms of imple- 
mentation, and simulation tests show the respective advantages and limitations of both 
strategies. 

2 Model and Control Objectives 

2.1 Model Description 

In this section, we first consider a generic mechanistic model that would, in princi- 
ple, allow the representation of the culture of different strains presenting an overflow 
metabolism (yeasts, bacteria, animal cells, etc). 

This model describes therefore the cell catabolism through the following three main 
reactions: 

Substrate oxidation : ksiS + koiO ^ kxiX + kciC (la) 

Substrate fermentation : ks2$ + &02 O ^-> kx2 X + kp2 P + kc2 C (lb) 
Byproduct oxidation : kpsP + k<j3 O r ^ kx3 X + kc3 C (lc) 
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where X, S, P, O and C are, respectively, the concentration in the culture medium of 
biomass, substrate (typically glucose), byproduct (i.e. ethanol or methanol in yeast cul- 
tures, acetate in bacteria cultures or lactate in animal cells cultures), dissolved oxygen 
and carbon dioxide. k& (i = 1, 2, 3, £ = X, S, P, O, C) are the yield coefficients and ri, 
V2 and r3 are the nonlinear specific consumption rates given by: 

n = EgfrJgiTggg) (2a) 



ksi 

ks2 



r 2 = ™^rs-r Scr J (2b) 



max 



y ">op 



rs) 



rs = ; (2c) 

kp3 

Note that these specific consumption rates are divided, for each reaction, by the corre- 
sponding substrate yield coefficient (ksi and ks2 for the main substrate, usually glu- 
cose, in the first two reactions and kps for the substitute carbon source, the byproduct, 
in the third reaction) in order to normalize the consumption mechanism with respect to 
the substrate source. For instance, the substrate consumption rate of the first reaction is 
ksiri which is equal to rs or rs crit if the oxidative capacity is completely exploited, 
while the corresponding biomass growth rate is fcxiTi which is respectively equal to 
ir^rs or t^ts + . The yield coefficient ratio t 2 ^ illustrates this normalization of the 

KSI ° KSI D crtt J k S i 

growth rate with respect to the substrate consumption. The kinetic terms associated with 
the substrate consumption rs and the critical substrate consumption rs crit (function of 
the cells oxidative or respiratory capacity ro) are given by: 

rs = fis sTKS (3a) 

= ro_ = Mo O K iP 

VScrxt ~ k os k os + K o K iP + P { ) 

These expressions take the classical form of Monod laws where us and fio are the 
maximal values of specific growth rates, Ks and Ko are the saturation constants of 
the corresponding element, and K iP is the inhibition constant. k os and k op represent 
the coefficients characterizing respectively the yield between the oxygen and substrate 
consumptions, and the yield between the byproduct and oxygen consumptions. In order 
to illustrate the role of k os and k op , consider for instance the oxygen consumed in the 
first two reactions ([Tat and dTbb . As shown by Fig.Q] a certain substrate quantity equal 
to ksiri is oxidized using an equivalent oxygen quantity fcoiToi = ksiri where roi 
can be seen as the oxygen consumption rate in the first reaction. In a similar way, the 
equivalent substrate and oxygen quantities required by the second reaction are equal and 
respectively defined as ks2^2 and ko2^02 where ro2 is the oxygen consumption rate in 
the second reaction. In order to link rs to ro, a global yield coefficient k os is defined as 
ro = k os rs- The introduction of k op in the model follows therefore the same reasoning 
for the byproduct. Nevertheless, note that for particular cells which do not need oxygen 
in ([TbK k os is sometimes summarized to koi and, for analogous reasons, k op to kos- 
As our aim is to provide a general representation of overflow metabolism, this notation 
is used in the following sections. 
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^02 r 02 ^S2 r 2 



Substrate 



^o\ r o\ ^s\ r \ 



Product 




Fig. 1. Schematic representation of the simplified kinetic model using k os as a global yield coef- 
ficient between the substrate and the oxygen consumptions 



( r O- k os r sV k < 



s > S crit 




Respirative regime Respiro-fermentative regime 

Fig. 2. Illustration of Sonnleitner's bottleneck assumption (|16|) for cells limited respiratory 
capacity 



Kinetic model ® is based on Sonnleitner's bottleneck assumption ( lfT6l ) which was 
applied to a yeast strain Saccharomyces cerevisiae (Figure[2]). This model explains that 
during a culture, the cells are likely to change their metabolism because of their limited 
oxidative capacity. When the substrate is in excess (concentration S > S cr it and the 
glucose consumption rate rs > fs crit ), the cells produce a byproduct P through the 
fermentative pathway, and the culture is said in (respiro-) fermentative (RF) regime. 



dt 


= (kxiri + k X 2T2 + k x ?>rz)X - 


dS 
dt 


= -(ksir 1 + k S 2r2)X + DS in - 


dP 

~dt 


= {k P2 r 2 - kp 3 r 3 )X - DP 


dO 

~dt 


= -(k m + k 2T2 + k 03 r 3 )X 


dC 
~dt 


= (kciri + k C 2T2 + k C3 r 3 )X - 


dV 
~dt 


— F 

— -L in 
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On the other hand, when the substrate becomes limiting (concentration S < S cr i t and 
the glucose consumption rate rs < rs crU ), the available substrate (typically glucose), 
and possibly the byproduct P (as a substitute carbon source), if present in the culture 
medium, are oxidized. The culture is then said in respirative (R) regime. 
Component- wise mass balances give the following differential equations: 

DX (4a) 

- DS (4b) 

(4c) 

- DO + OTR (4d) 
DC- CTR (4e) 

(4f) 

where Si n is the substrate concentration in the feed, F in is the inlet feed rate, V is 
the culture medium volume and D is the dilution rate (D = F in /V). OTR and CTR 
represent respectively the oxygen transfer rate from the gas phase to the liquid phase 
and the carbon transfer rate from the liquid phase to the gas phase. Classical models of 
OTR and CTR are given by: 

OTR = k L a(O sat - O) (5a) 

CTR = k L a(C-C sat ) (5b) 

where k^a is the volumetric transfer coefficient and, O sat and C sat are respectively the 
dissolved oxygen and carbon dioxide concentrations at saturation. 

The optimal operating conditions that maximize the biomass productivity are at the 
boundary of the two regimes. In these conditions, the fermentation and byproduct oxi- 
dation rates are equal to zero and, from 0: 

(6a) 
(6b) 

where 

(7a) 

(7b) 

(7c) 
v -f j\o J^iP i- r- 

the following relations hold: 

(8a) 
(8b) 



n = 


= min(r s ,r Scr 


J 


T2 '- 


= max(0, rs — 


S cr it ) 


&crit 


_ r o 

s 




rs 


~ S + Ks 






O 


K iP 


I: 


~^°0 + K o K iP + P 


n 


= r s = r Sc „ t 
r 2 =0 


- 12. 
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Expression (TTab shows that the respiratory capacity has an influence on the critical 
substrate concentration level. For illustration purposes, FigU shows a simulation of a 
fed-batch yeast culture where the substrate concentration in the culture medium is reg- 
ulated around a constant theoretical value S sp = 0.0226g/l. This constant value is 
based on the assumption that the respiratory capacity would not be influenced by the 



rs c 



rs, r 2 



Oand 



ethanol level (ro = ^o +k so mat ' following d8ab . r\ 
S S p = S cr u). As this assumption is not correct in practice, ethanol is produced dur- 
ing the batch, thus inhibiting the respiratory capacity and affecting the optimal glucose 



K iE 



SO 



level, and the biomass growth rate is lower than expected (ro = [io _t K K . +E 
that, following ©, n = rs crit < r^, f2 / and S sp ^ S cr it)> A simple regulation 
strategy, i.e., a regulation that does not adapt the glucose setpoint according to the respi- 
ratory capacity variations, does not allow to avoid the production of ethanol, leading to 
a poor level of productivity (while, as demonstrated in the following, more than 80 g/l 
of biomass can be obtained within 30 h with glucose setpoint adaptation, only 30 g/l 
are obtained in Fig.0. 




Fig. 3. Simulation of a fed-batch process controlled at a constant S sp value 



Consequently, after a trivial mathematical manipulation of (I8ab using the Monod law 
- — , a relation between the critical substrate concentration level and the cell 



respiratory capacity is obtained as: 



K s r 



sl^s ~ ro 



(9) 



Fig. [4] shows a plot of this relation where the point [0, 0] corresponds to a totally inhib- 
ited respiratory capacity, preventing any growth, and the point [ro rnax iS cr it rnax \ cor- 
responds to maximum productivity (i.e. absence of metabolite product in the culture 
medium and a sufficient level of oxygenation). Obviously, the presence of the byprod- 
uct in the culture medium can decrease the respiratory capacity and in turn the value 
of the critical substrate concentration. Moreover, the estimation of the critical substrate 
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level S cr it requires additional measurements (P, O) and a perfect knowledge of Ks, 
k os , fis, Ko, Mo and Kio, which are generally uncertain. 

In order to maintain the system at the edge between the respirative and respiro- 
fermentative regimes, it would be necessary to determine on-line an estimation of the bi- 
ological threshold S cr it and to control the substrate concentration in the culture medium 
around a setpoint S sp ideally equal to S cr u in order to reach the optimal operating con- 
ditions Q. 



0.025 



0.02 




3 4 5 

r [g/g/s] 



Fig. 4. Critical substrate level (S cr it), separating the two regimes, as a function of the respiratory 
capacity (ro) 



2.2 A Practical Suboptimal Strategy 

The maximum of productivity is obtained at the edge between the respirative and 
respiro-fermentative regimes, where the quantity of by-product is constant and equal 
to zero (VP = 0). Unfortunately, evaluating accurately the volume is a difficult task 
as it depends on the inlet and outlet flows including F in but also the added base quan- 
tity for pH control and several gas flow rates. Moreover, maintaining the quantity of 
byproduct constant in a fed-batch process means that the byproduct concentration has 
to decrease while the volume increases. So, even if the volume is correctly measured, 
VP becomes unmeasurable once P reaches the sensitivity level of the byproduct probe. 
For those practical limitations, a suboptimal strategy is elaborated through the control 
of the byproduct concentration around a low value P* depending on the sensitivity 
of commercially available probes (for instance, a general order for ethanol probe is 
0.1g/l), and requiring only an estimation of the volume by integration of the feed rate. 
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The basic principle of the controller is thus to regulate the byproduct at a constant 
low setpoint, leading to a self-optimizing control in the sense of 1 14] and ensuring that 
the culture operates in the respiro-fermentative regime, close to the biological optimum, 
i.e., close to the edge with the respirative regime. 



3 Linearizing Control Strategy 

The component- wise mass balances of reaction scheme (QJ lead to the following state- 
space representation 

x = Kr(x)X + Ax-ux + B{u) (10) 

where x = [X S P O C V)' is the state vector, r{x) = [r\ T2 r^ ]' is the vector of 
reaction rates, and u = D = F in /V is the control input (the dilution rate). The matrices 
K and A, and the vector function B(-) are given by: 



K 



kxi k X 2 


kx3 







-k S i -k S 2 







Oi n U 


k P2 
-koi -k 2 


-kp3 
-k 3 


, B(u) = 



k L a O sat 


kci k C 2 


kc3 




kLCl P S at 














(11) 



A = 



03x3 03x2 03x1 
02x2 — k^CL hx2 02x2 
0lx3 0ix2 



A feedback linearizing controller is illustrated in Figure [3 In a first step, this controller 
is derived assuming a perfect process knowledge. The basic idea is to derive a nonlinear 
controller, which allows a linearization of the process behavior ( 141111 ). 



Bioreactor 




Fig. 5. Linearizing control scheme 
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As the theoretical value of S cr it is very small (below 0.1 g/l) and assuming a quasi- 
steady state of S (i.e. considering that there is no accumulation of glucose when oper- 
ating the bioreactor in the neighborhood of the optimal operating conditions), the small 
quantity of substrate VS is almost instantaneously consumed by the cells ( ^ dt ' w 
and S « 0) and d4bt becomes: 



k S 2T2X = -ksiriX + S in u (12) 

where r\ and V2 are nonlinear functions of 5, P and O as given by d2all2bt . 
Replacing v^X by (fT2l) in the mass balance equation for P ficK we obtain: 

P = -^nl - fcp 3 r 3 X - u (p - k -^S m ) (13) 

&S2 V k S2 J 

A first-order linear reference model is imposed: 

d{P * d ~ P) = -HP* - P) , A > (14) 

and a constant setpoint is considered so that: 

dP 

— = A(P* - P) , A > (15) 

Equating ([T5l) and (fT3l) , the following control law is obtained: 

m fep 2 C _ D 

/CS2 ° iU ^ 

where kp £ ksi n and Zcp3r3 , the kinetic expressions, contain several uncertain parameters. 

3.1 A Classical Adaptive Strategy 

In (31, the parameter uncertainties are handled using an on-line estimation of the kinetic 
term kp ^ ksi n + kpsrs in the linearizing control law (fT6l) . 
The main operating assumptions are summarized as follows: 

- The byproduct concentration (in this case, ethanol) and the dissolved oxygen and 
carbon dioxide concentrations are available; 

- An exhaust gas analysis (adding 3 new measurements: OTR, CTR and Qe, the 
gaseous ethanol outflow rate) is available; 

- The stoichiometric parameters are known; 

- Kinetics are unknown. 

In this earlier work, the biomass is not measured on-line. Nevertheless, nowadays, many 
biomass probes are readily available and not as expensive as a gas analyzer could be 
(for instance, a turbidimetric probe or a conductance probe) so that, in this paper, the 
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biomass concentration measurement X replaces the gas analysis and the dissolved car- 
bon dioxide measurement. Note that in [4], these measurements are used by an asymp- 
totic observer to estimate X. The following adaptive scheme is therefore a simplified 
version of the original algorithm. 

F in = V* P ;- P) + §X (17) 

in /CP2 C _ P 

k S 2 ° in r 

A direct adaptive scheme as described in [ 2 ] is used. Consider the following Lyapunov 
function candidate: 

nt) = Up 2 + -) (is) 



where P = P* — P,0 = 6 — 6 and 7 is a strictly positive scalar. The specific growth 
rates n and r% (and, of course, the pseudo-stoichiometric coefficient k^) are assumed 
to be constant so that 6 variations are negligible (^| = 0). 

Using the Lyapunov stability theory, the time derivative of the Lyapunov candidate 
function should be negative for the closed-loop system to be stable: 

dV dP~ ~d0 1 
at dt dt 7 

Considering ([T5]) and a possible parameter mismatch (6^6): 

dP ~ 

— = -XP-0X (20) 

dt 




so that dT9t becomes: 

^ = -\P*-Pex-ei 1 - (2i) 

dt dt 7 

Choosing the following adaptive law cancels the second and the third terms: 

f = IXP (22) 

dt 

Notice that the adaptive law is asymptotically stable as the negativeness of ^ in ([T9t 
is guaranteed by (l22l) . 

3.2 A Robust Strategy 

Structural and parametric uncertainties can be lumped into a global parametric error: 

5 = 6-6 (23) 

where S is a nonlinear function of (5, P, O) representing possible inexact cancellations 
of nonlinear terms due to model uncertainties and 6 represents the hypothetical exact 
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unknown value. Rewriting the kinetic term in ( TTvT) using the new expression taken from 
d23K we obtain: 

k S 2 ° in r 

which corresponds to the perturbed reference system: 

P = A(P* -P)-5X (25) 

Borrowing the ideas of the Quasi-LPV approach [10], we bound the time- varying pa- 
rameter S which is supposed to belong to a known set A := {5 : 5 < S < 6} with 5 and 
S respectively representing the minimal and maximal admissible uncertainties. 

The parameter A is designed to ensure some robustness and tracking performance to 
the overall closed-loop system, which is modeled as follows: 

ka f P = -Xz-SX 
M: \z = P*-P (26) 

where z = P* — P is the performance output. 

Let w = [P* X] f c £ 2? [o,t] ^ e tne disturbance input to the system M, a(A, S) = 
[A — S ] and c = [ 1 ] . The closed-loop system d26t can be rewritten: 

M . )P = -AP + a(A, *)w; (27) 



{ 



z = -P + cw,SeA 



Consider the finite horizon (for instance, between the instant and the time T) £2 -gain 
of system M |9], representing the worst-case of the ratio of ||^|| 2 ,[o,t] (i- e -> the finite 
horizon 2 -norm of the tracking error) and ||w||2,[o,T] (i.e., the finite horizon 2 -norm of 
the disturbance input), which is defined as: 

II KA II IMk[0,T] /OQ . 

||M^||oo,[0,T] = SUp -—-r+ L (28) 

6eA,0^wC£ 2 ,[o,T] IMl2,[0,T] 

Thus, the parameter A is designed based on the Hoo control theory 191151 . Let a > be 
an upper limiting of || M wz ||oo,[o,t]- The problem is to find a such that: 

min a : H-M^Hoo r ,T] < ol (29) 

\,8(zlA. 

while ensuring the robust stability of system ([27]) . 

This optimization problem can be written in terms of linear matrix inequalities 
(LMIs) and solved using readily available toolboxes, e.g., SeDuMi [17] can be ap- 
plied to solve the problem. These constraints can be easily obtained via a quadratic 
Lyapunov function [ 3 1 

V(P) = P'QP (30) 

where Q is a strictly positive symmetric matrix (i.e., Q — Q' y 0) and " ' " corresponds 
to the transposition matrix operation. 
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The minimization in (|29t is then equivalent to: 



min a : V(P) y , V(P) + -z'z - aw'w -< 

a 

where, using d27t and d30K the time derivative of V(P) is given by: 

T>(P) = P'QP + P'QP 

= (-AP + aw)'QP + P'Q{-\P + aw) 

= -XP'QP + (aw)'QP - XP'QP + P'Qaw 

= -2\P'QP + a'w'QP + P'Qaw 

Using d32l) in d3TK the following expression is obtained: 



(31) 



—2m Qa 



-zz' -< 



(32) 



(33) 



where m = AQ and 7 n ^ is the unity matrix of dimension n w x n w and n w is the 
dimension of w. 

Now, consider the following lemma (Schur Complement): 

Lemma 1. The following matrix inequalities are equivalent 

(i) T>o,i?-5r^Vo 

(ii) R>0,T-S , R- 1 SyO 
R S~ 



(in) 



S'T 



yO 



Hence, using the expression of z, a and c in d27l) and Lemma 1, the optimization prob- 
lem in d29t can be written as follows: 

min a : a>0,(3 = Q / >0 and 

Q,m 

-2m m —SQ — 1 

m —a 1 
-SQ -a 

-11 -a 



^0 



(34) 



If there exists a feasible solution to the above optimization problem for all 8 evaluated 
at the vertices of A, then (|29t is satisfied and A = mQ" 1 . 

Remark 1: Quadratic Lyapunov functions may be conservative for assessing the stabil- 
ity of parameter-dependent systems |5|. However, a parameter-independent Lyapunov 
function is considered in this study for two main reasons: 

1. A is parametrized with the Lyapunov matrix Q so as to obtain a convex design 
condition. A parameter-independent matrix Q therefore results in a parameter- 
independent control law; 

2. the variation of S is a priori unknown. 
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Remark 2: This method is likely to be conservative, as the parameter S has to bound 
the nonlinearities of the inexactly cancelled terms. Less conservative results can be 
obtained by considering the approach of [ 6 1 to deal with the nonlinearities at the cost of 
a larger computational effort. 

4 Numerical Results 

In this section, for comparing the adaptive and robust linearizing control strategies, sev- 
eral numerical simulations considering small-scale bacteria and yeast cultures (respec- 
tively in 5 and 20 [I] bioreactors) are performed. The first simulation set is dedicated 
to yeast cultures with initial and operating conditions: Xq = OAg/l, So = 0.5g/l, 
E = O.Sg/h O = O sat = 0.0350/Z, C = C sat = 1.286^/Z, V = 6.8/, S in = 
350g/l. The second simulation set is dedicated to bacteria cultures with initial and op- 
erating conditions: X = OAg/l, S = O.Obg/l, A = 0.8g/l, O = O sat = 0.035g/l, 
Co = C sat = 1.286g/l, V = 3.5/, S in = 250g/l. 

The values of all model parameters are listed in Tables [HO [3] and [4] Note that, for 
yeast cultures, coefficients k os and k oa are simply replaced by koi and ko3 while k<j2 = 
0, in accordance with the model of [16]. For the bacteria model, parameters values are 
taken from [13] and slightly modified to adapt the yield coefficient normalization to 
the proposed reaction scheme (QJ and kinetic model (with a slight difference in the 
formulation of rs). 

The state variables are assumed available (i.e., measured) online for feedback. The 
adaptive and robust linearizing feedback controllers proposed in section [3] aim at track- 
ing the byproduct set-point (E* and A* = 1 g/l) which is chosen sufficiently low 
so as to stay in the neighborhood of the optimal trajectory but also sufficiently high 
to avoid probe sensitivity limitations. In this setup, a noisy byproduct measurement is 
considered. 

To design the parameter A in d25b via the optimization problem d29k the parameters 
Ks, Kp, Ko and K ip , jjls, /io are assumed to be respectively varying with standard 

Table 1. Yield coefficients values of Sonnleitner and Kappeli for S. cerevisiae model 1 16 1 



Yield coefficients 


Values 


Units 


kxi 


0,49 


g of X/g of S 


k X 2 


0,05 


g of X/g of S 


kx3 


0,72 


g of X/g of E 


ksi 


1 




ks2 


1 




k P2 


0,48 


g of E/g of S 


k P3 


1 




koi 


0,3968 


g of 2 /g of S 


ko2 





gof0 2 /gofS 


ko3 


1,104 


gof0 2 /gofE 


kci 


0,5897 


gofC0 2 /gofS 


kc2 


0,4621 


gofC0 2 /gofS 


kc3 


0,6249 


gofC0 2 /gofE 
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Table 2. Kinetic coefficients values of Sonnleitner and Kappeli for the S. cerevisiae model 1 16 1 



Kinetic coefficients 


Values 


Units 


Mo 


0,256 


gof0 2 /gofX/h 


MS 


3,5 


g of S/g of X /h 


Ko 


0,0001 


gof0 2 /l 


K s 


0,1 


9 of S/l 


K E 


0,1 


9 of E/l 


K iE 


10 


9 of E/l 



Table 3. Yield coefficients values of Rocha's E.coli model |[T3l 



Yield coefficients 


Values 


Units 


kxi 


1 




k X 2 


1 




kx3 


1 




ksi 


0,316 


g of S/g of X 


ks2 


0,04 


g of S/g of X 


kp 2 


0,157 


g of A/g of X 


k P3 


0,432 


g of A/g of X 


koi 


0,339 


gof0 2 /gofX 


ko2 


0,471 


gof0 2 /gofX 


ko3 


0,955 


gof0 2 /gofX 


kci 


0,405 


gofC0 2 /gofX 


kc2 


0,754 


gofC0 2 /gofX 


kc3 


1,03 


gofC0 2 /gofX 


K>os 


2,02 


gof0 2 /gofX 


fcoa 


1,996 


gof0 2 /gofX 



Table 4. Kinetic coefficients values of Rocha's E.coli model [ 13 1 



Kinetic coefficients 


Values 


Units 


Mo 


0,7218 


gof0 2 /gofX/h 


Ms 


1,832 


g of S/g of X /h 


Ko 


0,0001 


gof0 2 /l 


K s 


0,1428 


9 of S/l 


K A 


0,5236 


9 of A/l 


K iA 


6,952 


9 of A/l 



deviations of 500% and 35% of their nominal values. Simulating the operating condi- 
tions of the control strategy in (l24lk we may infer that S = 0.5/3600 8 _1 and ^=0s _1 
for yeast cultures and S = 0.1/3600 s _1 and 5 = s -1 for bacteria cultures. In light 
of ([2Tb and d29k these constraints lead to A = 0.0056 and A = 0.0046 for yeasts and 
bacteria, respectively. 

Concerning the adaptive control law, A = 1 and 7 = 0.05 for yeast cultures while 
A = 2 and 7 = 0.25 for bacteria cultures. Note also that the sampling period is chosen 
equal to 0.1 h. 



Adaptive and Robust Linearizing Control Strategies for Fed-Batch Cultures 297 



10r 

8 - 
^ 6 - 

LU 4 - 

2 - 
0- 






10 



15 



20 



25 



x10~ 




10 15 

Time [h] 



Fig. 6. Yeast cultures - ethanol concentration and feed rate when the controller is designed us- 
ing a plain linearizing control approach (no adaptation and no robustification) in the presence of 
modeling errors 



100 



o) 60 




10 15 

Time [h] 

Fig. 7. Yeast cultures - biomass concentrations of 50 runs with random parameter variations and 
a noise standard deviation of 0.15 [g/l] using the robust control strategy (i.e., A = 0.0056) 

Before discussing the results of the proposed methods, it is interesting to observe the 
performance of a plain linearizing controller, i.e. without adaptation or robustification, 
applied to the yeast process in the presence of modeling errors. For instance, consider 
the situation where the user selects a relatively high gain A = 1, and 6 is fixed to 
&P2/2. Figure [6] illustrates the consequences of such choices. Even if the controller 
behaves correctly during the first hours, the divergence of the ethanol signal during the 
last hours will impact the quality of the culture. 
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Figure [7] shows now the closed-loop response of biomass X and figure [8] ethanol E 
concentrations, for 50 different values of the kinetic parameters (which were randomly 
chosen) in yeast cultures under a robust control strategy. In all simulation runs, a white 
noise is added to the ethanol concentration measurement with a standard deviation of 
0.15 [g/l] and the culture is considered as always evolving in the optimal operating 
conditions in which n = ^— - and r% = so that the hypothetical parameter 6 in (l24t 
is taken as 

fcp2&Sl 



kp2ksi 



n + k P3 r 3 



-Ho 



(35) 



ks2 l ' ' koi 

In Figure [TJ the different curves are more or less indistinguishable except in the last 
hours where the consequences of model errors appear. Nevertheless, these results are 
very satisfactory as model errors have a negligible influence even if, in figured! ethanol 
concentrations may vary from 0.5 to 1.5 g/l when the high biomass dynamics is coupled 
to important model errors. In order to enhance the idea of a negligible influence of these 
last variations, figure[9] shows the different productivity levels of each run. It is obvious 
that, from an operating point of view, the results are satisfactory as the productivity 
remains higher than 98 % of the optimal value. 

Figures [10| and [TT] show the results of a simulation performed with the same initial 
and operating conditions with the adaptive strategy, in the ideal case where there is no 
measurement noise, whereas Figures[T2land[T3lcorrespond to a noise standard deviation 
of 0.05 [g/l] added to the ethanol concentration measurements. 

Due to sensitivity problems of the adaptive law, higher noise levels usually lead to 
computational failures. A way to overcome part of this sensitivty problem is proposed in 
|4|. The dynamics of ethanol is increasing as the biomass grows, so that the linearizing 
reference model should take these dynamics into account under the following form: 



d(P* - P) 
Jt 



-(Ai+A 2 X)(P*-P) Ai,A 2 >0 



(36) 




10 15 

Time [h] 

Fig. 8. Yeast cultures - ethanol concentrations of 50 runs with random parameter variations and 
a noise standard deviation of 0.15 [g/l] using the robust control strategy (i.e., A = 0.0056) 
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Fig. 9. Yeast cultures - productivity levels of the 50 runs with random parameter variations using 
the robust control strategy (i.e., A = 0.0056) 
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Fig. 10. Yeast cultures - adaptation and biomass concentration - adaptive control strategy - no 
measurement noise 



In expression d36K the time constant decreases as the system dynamics (represented 
by the biomass growth) increases. The new linearizing control law becomes: 



V 



(Ai+A 2 X)(P*-P) + 0X 



fcp2 q. 
k S 2 ° 11 



(37) 
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Fig. 11. Yeast cultures - ethanol concentration and feed flow rate - adaptive control strategy - no 
measurement noise 
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Fig. 12. Yeast cultures - adaptation and biomass concentration - adaptive control strategy 
noise standard deviation of 0.05 [g/l]. 



Nevertheless, Figure [14] shows that the adaptive method is still sensitive to the noise 
level as simulation failures occur for standard deviations higher than 0.12bg/l (i.e., 
below the noise level used for the robust strategy). 
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Fig. 13. Yeast cultures - ethanol concentration and feed flow rate - adaptive control strategy - 
noise standard deviation of 0.05 [g/l] 
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Fig. 14. Yeast cultures - ethanol concentration, feed flow rate and parameter adaptation for 5 
different noise levels going from 0.025 to 0.125 g/l. Ai = 1 and A2 = 0.2 
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Fig. 15. Bacteria cultures - biomass concentrations of 50 runs with random parameter variations 
and a noise standard deviation of 0.15 [g/l] using the robust control strategy (i.e., A = 0.0046) 
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Fig. 16. Bacteria cultures - acetate concentrations of 50 runs with random parameter variations 
and a noise standard deviation of 0.15 [g/l] using the robust control strategy (i.e., A = 0.0046) 



However, when the parameter adaptation performs well, the productivity of the adap- 
tive and robust strategies is more or less the same, i.e., a biomass concentration of ap- 
proximately 90 g/l is obtained within 24 hours. 

Figures [151 US] and [TT] show the closed-loop response of biomass X and acetate A 
concentrations, and the productivity levels for 50 different values of the kinetic parame- 
ters which are randomly chosen, in the bacteria cultures under a robust control strategy 
(noise level of a = 0.15 g/l). 
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Fig. 17. Bacteria cultures - productivity levels of the 50 runs with random parameter variations 
using the robust control strategy (i.e., A = 0.0046) 
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Fig. 18. Bacteria cultures - acetate concentration, feed flow rate and parameter adaptation for 5 
different noise levels going from 0.025 to 0.125 g/l. Ai = 2 and A2 = 0.4 

Figure [18] show similar simulation runs as in figure [14] with the adaptive strategy 
where the noise level increases from a = 0.02b g/l to a = 0.125 g/l. The same 
comments as in the yeast case concerning the noise sensitivity apply. 
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Note that, as shown in figure[l5] the productivity is lower in the bacteria cultures (for 
biological and operating reasons, bacteria strains lead to reaction rates and, therefore, 
growth rates that are smaller than yeast reaction rates). However, from a control point 
of view, results of the robust strategy are satisfactory in both cases. 

5 Conclusions 

Linearizing control is a powerful approach to the control of fed-batch bioprocesses. In 
most applications reported in the literature, on-line parameter adaptation is proposed in 
order to ensure the control performance despite modeling uncertainties. On-line param- 
eter adaptation is however sensitive to measurement noise, and requires some kind of 
tuning. On the other hand, robust control provides an easy design procedure, based on 
well established computational procedures using the LMI formalism. Large parametric 
and structural uncertainties, as well as measurement noise levels can be dealt with. 
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Abstract. Application of tools from optimal flow control to the field of computer 
vision and image sequence processing, has recently led to a new and promising 
research direction. We present an approach to image motion estimation that uses 
an optimal flow control formulation subject to a physical constraint. Motion fields 
are forced to satisfy appropriate equations of motion. Although the framework 
presented is flexible with respect to selection of equations of motion, we employ 
the Burgers equation from fluid mechanics as physical prior knowledge in this 
study. To solve the resulting time-dependent optimisation problem we introduce 
an iterative method to uncouple the derived state and adjoint equations. We per- 
form numerical experiments on synthetic and real image sequences and compare 
our results with other well-known methods to demonstrate performance of the 
optimal control formulation in determining image motion from video and image 
sequences. The results indicate improved performance. 

Keywords: Optimal control, Motion estimation, Physical prior knowledge, 
Optimisation. 



1 Introduction 

In this work we are concerned with motion estimation of objects in image sequences. 
The understanding and reconstruction of dynamic motion in image scenes is one of the 
key problems in computer vision and robotics. We present an attempt to adopt con- 
trol methods from the field of applied mathematics in a new form to image sequence 
processing and to provide preliminary evaluations of the capability of this approach. 

We describe motion as the displacement vector field of pixels between consecutive 
frames of an image sequence. In the literature this is known as optical flow (9). In 
computer vision local and global approaches are used to compute the optical flow field 
of image sequences. Local approaches are designed to compute the optical flow at a 
certain pixel position by using only the image information in the local neighbourhood of 
this specific pixel 1 10]. Variational optical flow methods represent global optimisation 
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problems which can be used to recover the flow field from an image sequence as a global 
minimiser of an appropriate energy functional. Usually, these energy functionals consist 
of two terms: a data term that imposes the result to be consistent with the measurement 
(here the brightness constancy assumption) and a regularisation term which imposes 
additional constraints like global or piecewise smoothness to the optical flow field. 

One of the first variational methods for motion analysis was introduced by and 
incorporates a homogeneous regularisation term, where the optical flow is enforced to 
vary smoothly in space. This leads to an undesired blurring across motion discontinu- 
ities. Therefore, regularisation terms were introduced to regularise the flow in an image- 
driven 112115111 or flow-driven 1 16 6] way, where the flow is prevented from smoothing 
across object or motion boundaries, respectively. A systematic classification of these 
approaches can be found in |fT9l . 

Most of the variational approaches incorporate a purely spatial regularisation of the 
flow. However, some efforts have been made to incorporate temporal smoothness |[T3l . 
The work of l20l investigates an extension of spatial flow-driven regularisation terms 
to spatio-temporal flow-driven regularisers. Time is considered as a third dimension 
analogue to the two spatial dimensions. These approaches improve both the robustness 
and the accuracy of the motion estimation but the flow computation involves the data of 
the full image sequence at once. 

Note that all these approaches do not incorporate physical prior knowledge about the 
motion itself. In contrast our approach incorporates a space-time regularisation using 
physical prior knowledge in a control framework that draws on the literature on the 
control of distributed parameter systems in connection with fluid dynamics |7 ]. 

The ideas of two existing control approaches that are related to motion computation 
of image sequences are presented by 1 14] and 1 3 ]. Ruhnau and Schnorr presented an op- 
tical flow estimation approach for particle image velocimetry that is based on a control 
formulation subject to physical constraints (Stokes equation). Their aim is to estimate 
the velocities of particles in image sequences of fluids rather than to estimate motion in 
every day image scenes. 

The basic idea of is to estimate both an optical flow field u and a rectified image 
function / satisfying the brightness constancy assumption. Note that in their approach 
Yk (and not /) denotes the sampled images of the image sequence. The most significant 
difference to our optical flow approach is that they do not only estimate the optical 
flow u, but also Ik which is an approximation of the captured grey value distributions 
Yk, where k specifies the frame number within the image sequence. As part of the first- 
order necessary optimality conditions of the Lagrangian functional their optimal control 
formulation does not require a differentiation of the image data. 

In contrast to that approach, we interpret the grey values of a scene as a "Active fluid" 
- assuming that its motion can be described by an appropriate physical model, in this 
work realised with the Burgers equation of fluid mechanics. We adopt the well estab- 
lished variational optical flow approach of |9 ] and add a distributed control exploiting 
the Burgers equation resulting in a constrained minimisation problem. The obtained 
objective functional has to be minimised with respect to the optical flow and control 
variables subject to the model equation over the entire flow domain in space and time. 
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Our approach estimates not only the optical flow data from an image sequence, but 
it also estimates a force driven by the Burgers equation. The force field indicates the 
violation of the equation and can indicate accelerated motions like starting or stopping 
events or the change of the motion direction. Therefore one can exploit this feature as 
an indicator of unexpected motion events, taking place in the image sequence. 

The initially constrained optimisation problem is reformulated - exploiting Lagrange 
multipliers - into an unconstrained problem allowing to obtain the associated first-order 
optimality system. This results in a forward-backward system with appropriate initial 
and boundary conditions. To solve the optimality system we uncouple the forward and 
backward computation as described in [7 1 leading to an iterative solution scheme. 

2 Approach 

Before we start to describe the approach in more detail we first exemplify the notation 
and components of our control formulation. 

We define a grey value of a certain pixel within an image sequence by a real valued 
one-time continuously differentiable C 1 image function I(x, t), where x = (xi, £2) T 
denotes the location within some rectangular image domain Q and t G [0, T] labels 
the corresponding frame at time t. In particular, the function I(x\,X2,t) denotes the 
intensity of a pixel at position (x\ , x 2 ) T in the image frame at time t. The optical flow 
field is denoted by a two-dimensional vector field u = (ui(x, t), u 2 (x^ t)) T , which 
describes the intensity changes between images. 

We formulate our motion estimation problem within a variational framework. We 
minimise an energy functional E, which consists of a data and a regularisation term: 

Data Term. We make use of the following data term 

(dtl + u-VI) 2 dx, (1) 



which comprises the optical flow constraint 1 9 ] and provides the link between the given 
image data, the observed intensity / and the desired velocity field u. Note that the op- 
tical flow constraint equation represents the requirement that the intensity of an object 
point stays constant along its motion trajectory. Problem (Q]) is ill-posed as any vec- 
tor field u satisfying u ■ VI = —d t I, is a minimiser. Therefore a regularisation term 
is added to introduce additional constraints for the flow field u to obtain an unique 
solution. 

Regularisation Rerm. We incorporate the regularisation term from (5) 

a(\Vu x \ 2 + \Vu 2 \ 2 )dx, 0<aGR, (2) 



/■ 



Q 

to enforce spatial smoothness of the optical flow field, preferring neighbouring optical 
flow vectors to be similar. The regularisation parameter a adjusts the relative impor- 
tance of the smoothness term to the data term. With an increasing value of a the vector 
field is forced to become smoother. We are aware that regulariser like the Ll-regulariser 
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used for example in [18] allows for sharper discontinuities in the flow field. Our deci- 
sion to use the L2-regulariser in the motion estimation was mainly driven by the idea 
to keep the approach clear and numerically simple. However, the replacement of the 
quadratic homogeneous smoothness term could improve the accurateness of the com- 
puted motion boundaries. 

Physical Prior. Considering a constant moving object one can determine that structures 
are transported by a velocity field and along with it the velocity field is transported 
by itself. A physical model equation, which describes this behaviour is the Burgers 
equation and allows to model the movement of rigid objects. 
The inviscid Burgers equation 



— u = d t u + (u • V)u = , u(x, 0) = uq 



(3) 



has been studied and successfully applied for many decades in aero- and fluid dynam- 
ics 14181 as a simplified model for turbulence, boundary layer behaviour, shock wave 
formation and mass transport. It contains the convection term from the fundamental 
equations of fluid mechanics, the Navier-Stokes equations. 

As a physical interpretation, u in ® may be regarded as a vector of conserved (fic- 
tive) quantities or states, with corresponding density functions u±, u<i as components. 
The material derivative -^ yields the acceleration of moving particles. The nonlinear 
term (u ■ V)u is known as the inertia term of the transport process described by ®. See 
Figure Q] for an illustration of the transport. We found that our approach even with the 
constant velocity assumptions of our physical prior predicts the non-uniform motion 
pattern quite well as shown in our numerical results (cf. Sec. 14.21) . 




Fig. 1. Illustration of the transportation of a vector field with equation <[3j at times t = 0, 5, 10. 
Gray values visualise vector magnitudes. Fictive particles move along a shock front in the lower 
right direction. In the absence of any further external information, a region of rarefaction arises 
due to mass conservation, acting like a short- time memory. 



2.1 Optimal Control Formulation 

In the following sections we explain our optimal control approach. Foundations exploit- 
ing fluid dynamical methods can be found in the book of Gunzburger [7 1. 
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We obtain our spatial-temporal control approach as follows: Additionally to the 
smoothness term we introduce a control /, that is distributed in space and time, which 
means that it acts over the entire optical flow domain Q x [0,T]. The magnitude of 
the control is bounded due to penalisation within the objective functional. The resulting 
optimisation problem is to minimise 

E(uJ) = ± J [(d t I + u.VI) 2 + a(\\/ Ul \ 2 ^\Wu 2 \ 2 )^p\f\ 2 }dxdt, (4) 

Qx[0,T] 

subject to the equations of motion 



' d t u + (u • V)u = f in (0, T) x J?, 
d n u = on(0,T)xr. 



(5) 



We intend to find an optimal state u = (u±, U2) T and an optimal control / = (/i, /2) T , 
such that the functional E(u, f) is minimised and u and / satisfy the Burgers 
equation ([5]). 

The objective of this formulation is to determine a body force / (the control !) that leads 
to a velocity field u which fits to the apparent motion in the image sequence, and at the 
same time satisfies physical prior knowledge in terms of the given equations of motion. 



2.2 Optimality System 

In order to obtain the velocity field u and the control / we recast the constrained opti- 
misation problem © - ([5]) into an unconstrained optimisation problem. Introducing the 
Lagrange multiplier or adjoint variable w = (wi (x, t), W2 (x, t)) T yields the following 
Lagrangian functional 

L(u, f, w) = E(u, f) - J w T (d t u + (u ■ V)u - f) dxdt. (6) 

f2x[0,T] 

To solve this functional we have to derive the first-order necessary conditions. This 
results in the following optimality system (jT])-© from which the optimal state u, ad- 
joints w, and the optimal control / can be determined such that L(u, /, w) is rendered 
stationary. 

d t u+{u- V)u = f in J?x [0,T], 

d n u = onTx[0,T], (7) 

u\t=o = ^o in 1?, 



—dtw — (u • \7)w — w\7 • u + (\7U) T w 
VI{d t I + u • VI) - aAu in Q x [0, T] , 

w = onTx[0,T], () 

w\t=T = in i?, 
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/3f + w = ini?x[0,T], 

/ = onfx[0,T], (9) 

f\ t=T = in n, 

where (V£7) T the transposed Jacobian matrix. 

The state equation © is obtained by derivation of the Lagrangian functional © in 
the direction of the Lagrange multiplier, and turns out to be identical to the Burgers 
equation §5§ itself. The adjoint equation ® specifies the first-order necessary condi- 
tions with respect to the state variables u. The optimality condition © is the necessary 
condition that the gradient of the objective function - with respect to the control / - 
vanishes at the optimum. It also includes the initial and terminal conditions. 

In the next section we describe how the optimisation problem can be solved. 

2.3 Solution of the Optimality System 

The optimality system J7]) - © is a coupled system which turns out to be - due to 
the large number of unknowns - prohibitively expensive to solve directly. To solve this 
system we introduce an iterative method which decouples the state and adjoint compu- 
tation. This results in a gradient descent method which consists of the iterative solution 
of the state and adjoint equation in such a way that the state equation is computed for- 
ward in time with appropriate initial condition u$ and the adjoint equation is computed 
backward in time with terminal condition w t =r = 0. The optimality condition is used 
to update the control / with the adjoint variable w. The control / is then used to com- 
pute the actual state u. The step length is adaptively adjusted to ensure that the actual 
energy of the objective functional © is smaller than in the previous iteration. Note that 
we choose the start value for / to be zero in the very first iteration. 

Gradient of the Objective Functional. To determine the optimal state and control we 
use - as mentioned above - a gradient descent method. Therefore, we have to determine 
the gradient of our objective functional © with respect to the control /. The state 
equation © is solved to determine the state u as a function of the control / so that the 
functional J(u, f) = J{u(f), f) is a function of only the control /. Applying the chain 
rule to J (u (/),/) we obtain the total derivative of the functional ©: 

_ _ dJdu dJ ._. 

d ^ J = ^w + dj- (10) 

We specify in detail the gradient %. This term defines the so-called sensitivities and 
describes the change of the state when the control variable changes. To obtain this 
dependency we can use a variation of the distributed control / + ef which is assumed 
to correspond to the changes of the state utou + ev,f. The change Uf is determined by 
the state system: 

(dt(u + euf) + ((u + euf)'V)(u + eu f ) = f + ef in Q x [0,T], 
\ d n (u + euf)=0 onTx[0,T]. 



A New Framework for Motion Estimation in Image Sequences 313 

One obtains for e — > the sensitivity equation 

f d t Uf + (u f ■ V)u + (u ■ V)u f = f in Q x [0,T], n n 

\ <9 n ^ = onrx[0,T], l } 

which describes the fact that an infinitesimal variation of the control in the direction of 
/ induces an infinitesimal variation in the direction of the local velocity Uf . 

The change in the functional J(u,f) of ©, effected by an infinitesimal change in 
the direction / in the control / leads to the gradient of the objective functional: 

d f J = w + Pf. (12) 

This gradient is used within our iterative gradient descent algorithm to update the search 
direction. Below we sketch the implementation of the gradient method. 

2.4 Algorithm 

We solve the optimality system IT])-© using an iterative gradient descent method (with 
step length adoption) which decouples the state and adjoint computation. It consists of 
the iterative solution of the state and adjoint equation in such a way that the state equa- 
tion is computed forward in time with appropriate initial condition uo and the adjoint 
equation is computed backward in time with terminal condition w t =T = 0. The opti- 
mality condition is used to update the control / with the adjoint variable w. The control 
/ is then used to compute the actual state u. Additionally, the step length is adjusted 
ensuring that the actual energy of the objective functional © decreases. Note that we 
choose the start value for / to be zero in the very first iteration. 

In our pseudo code description of Algorithm Q] variable s denotes the step-size that 
is adapted by the algorithm and e the threshold which is used to decide if the relative 
difference of the energy is small enough to be seen as converged. 

In the initial step of the algorithm the flow fields u for all consecutive image frames 
and the terminal condition of the adjoint variable for the last frame (w t =T) are set to 
zero. The first step of the iteration loop solves the adjoint equation (8) for w backwards 
in time using the terminal condition on w and the flow field u. Then, the optimality 
condition (9) is used to update the control field for all frames, allowing the state equation 
(8) to be solved for u forward in time using the new control field. The iteration loop 
continues until the decline in E is negligible. 

3 Numerical Solution 

In this part, we summarised the numerical discretisation methods employed in solving 
the optimality system ([T])-©. For more details, we refer to Q. 

Discretisation of the State Equation. Within the numerical implementation of the non- 
linear state system equation J7]) we have to cope with over- and undershoots, with shock 
formations, with the compliance of conditions (entropy-, monotony-, CFL-condition, 
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Algorithm 1. Gradient algorithm with automatic step-length selection. 



1 

2; 
3: 
4 
5: 
6 
7: 
8 
9 

10: 
11 
12 



set u = 0, e = 10 _ , and s := so (initial step) 
repeat 

solve the adjoint equation ([§]) for w 
update /: / m = f m -i - s(/3f m -i + w) 
solve the state equation for n 
ifE(u,fm) > E(u,f m -i) then 
s := 0.5s 
GOTO0] 
else 

s := 1.5s 
end if 
until|#(u,/ m )-£(^ m _i)|/|#K/ m )| <e 



etc.) and different discretisation schemes. We use the second-order conservative Go- 
dunov scheme for our implementation. The fluxes are numerically computed by solv- 
ing the equations at pixel edges. The correct behaviour at discontinuities is obtained by 
using solutions of the appropriate Riemann problem. 

Discretisation of the Adjoint Equation. The numerical implementation of the time- 
dependent adjoint system ([5]) in the domain Q is done by using a second-order predictor- 
corrector finite difference scheme. The basic idea behind this is that all methods with 
an accuracy larger than the order one will produce spurious oscillations in the vicinity 
of large gradients, while being second-order accurate in regions where the solution is 
smooth. To prevent such oscillations the slopes of Fromm's method are replaced by 
the slopes of the Van Leers scheme. The Van Leer scheme detects discontinuities and 
modifies its behaviour in such locations accordingly. The implication of this is that this 
method retains the high-order accuracy of Fromm's scheme in smooth regions, but near 
discontinuities the discretised evolution equation drops to first-order accuracy. 

4 Experiments 

In this section we first illustrate the control performance of our optical flow approach on 
a real-world 2D image sequence. Secondly, we evaluate the following motions which 
violate the incorporated motion assumption: rotation, translation in combination with 
scaling. Finally, we present the results for noisy image data showing the influence of 
the temporal regularisation in the control approach and provide a comparison with error 
measures obtained by the approach from [9 ] and the dynamic optical flow approach 
from(n). 

4.1 Control - Force 

We illustrate the control behaviour of our approach for a real- world 2D image sequence 
with an unexpected motion. The image sequence consists of 10 image frames and shows 
a moving hand which starts to move and then stops again. Figure [2] depicts the starting 
(left column) and stopping (right column) event of the sequence. The first row shows 
the velocity estimates u, and the second row shows the force fields. The force field / 
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Fig. 2. "Waving hand" sequence: Unexpected events. Top: A waving hand stops. The estimated 
optical flow field u for a starting (left) and stopping (right) event is depicted in blue. Bottom: 
The corresponding control field / is shown in red. The force acts when the hand starts to move 
(left) and reacts into the opposite direction of the flow field (right) when it stops and forces the 
flow field into the observed state of no motion (parameters: a = 0.01, /3 = 0.0001). 



nicely indicates the deviation of the expected motion from the observed motion. This 
is evident in the second row of Figure 4, where the force field acts in the direction of 
the moving hand as the hand accelerates into motion (left picture), while it turns in the 
opposite direction as the hand stops (right picture). 

4.2 Non-uniform Motion 



In this section we provide an evaluation of our approach on the basis of two well known 
synthetic image sequences for which the ground truth motion data is available. To allow 
for a quantitative comparison we provide the results we obtain for the Horn and Schunck 
as well. The image sequences we use show global motion patterns such as rotation, 
translation and divergence. 

In particular we evaluate our approach on the grey value versions of the following 
two image sequences: the "rotating sphere" sequence 1 1 1 ] and the "Yosemite" sequence 
(available at ftp : //ftp . csd.uwo . ca/pub/vision). 
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Fig. 3. The synthetic "rotating sphere" sequence. The sphere rotates in front of a stationary back- 
ground. Left: Gray value version of frame 6 that is used in our computations. Right: Vector plot 
of the ground truth data. 



Fig. 4. The synthetic "rotating sphere" sequence. Computational Results for the Horn and 
Schunck approach and the control based approach. Left: Result Horn and Schunck (RMSE = 
0.395). Right: Result control based approach (RMSE = 0.192). 



The "rotating sphere" sequence contains a curling vector field and is shown in 
Figure This sequence consists of 45 frames, where a sphere rotates in front of a 
stationary background. 

The computed vector fields obtained by the Horn and Schunck approach and our 
approach ©-([5]) for the "rotating sphere" sequence is shown in Figured 

The motion estimation results for the "Yosemite" sequence are shown in section l43l 

The results show that even for sequences which violate the constant velocity assump- 
tion of the model equation we obtain good results. However, due to the flexibility of our 
variational approach it should be possible to model such motion patterns by incorpora- 
tion of a suitable model equation. 
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Fig. 5. Left: Yosemite sequence. Centre: We added Gaussian noise with standard deviation a = 
40. Right: The shown high quality optical flow field is obtained by the control based optical flow 
approach @} - (parameters: a = 0.05 and f3 = 0.000003). 



4.3 Temporal Regularisation 

To investigate the impact of the temporal regularisation to the robustness of our ap- 
proach under noise, we choose the "Yosemite" sequence with different Gaussian noise 
levels a = 0, 10, 20 and 40 (cf. Fig.©. 

The sequence exhibits divergent and translational motion combined with illumina- 
tion changes. To investigate the performance of our approach we compare the root mean 
square error (RMSE) 



RMSE(u ,u e ) = —y / \/(u - u e ) 2 dx 



(13) 



and the average angular error (AAE) 



AAE(u Q ,u e ) = — 



Q 



arccos 



n 



u • u e 

\u \\u e \ 



dx, 



(14) 



where | • | denotes the Euclidean norm, u Q = (u 0l ,u 02 , 1) T the original optical flow 
vectors, and u e = (u ei , u e2 , 1) T the estimated optical flow vectors (compare [2]). Note 
that the time dimension is set to 1 corresponding to the distance of one frame. 

This measure is currently used as a kind of standard to provide accuracy measures 
for optical flow results. 

We compare the errors of the optical flow computation obtained for three different 
approaches with optimised parameters. In particular these are the homogeneous spatial 
regularised approach from [9], the spatio-temporal dynamic image motion approach 
from ifTTlL and our control based image motion approach © - (0). The control approach 
results in a improved vector field, which is based on the forward-backward computation, 
which incorporates additional knowledge of the future frames leading to an improved 
temporal regularisation. 

The results for the computed errors (RMSE and AAE) for all three approaches with 
increasing noise level are shown in Table \T\ The purely spatial regularised approach 
from Horn and Schunck and the absence of physical prior knowledge leads to the higher 
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Table 1. Performance of our control approach (C) in comparison with the Horn and Schunck 
approach (HS) and the dynamic image motion approach (Dy) in presence of noise: We added 
random Gaussian noise with zero mean and standard deviation a — 0, 10, 20, and 40 to the 
Yosemite image sequence 



noise 


approach 


a 


P 


RMSE 


AAE 


a = 


HS 


0.005 


- 


0.177 


3.04 u 


Dy 


0.006 


0.00002 


0.178 


3.09 u 


C 


0.007 


0.0005 


0.169 


2.88 u 


cr= 10 


HS 


0.008 


- 


0.283 


5.74 u 


Dy 


0.01 


0.0003 


0.275 


5.68 u 


C 


0.009 


0.0001 


0.243 


4.92^ 


(T = 20 


HS 


0.02 


- 


0.429 


8.61° 


Dy 


0.025 


0.001 


0.395 


7.54^ 


C 


0.02 


0.00001 


0.350 


6.67 u 


<T = 40 


HS 


0.05 


- 


0.640 


13.27 u 


Dy 


0.05 


0.005 


0.523 


9.89 u 


C 


0.05 


0.000003 


0.497 


9.16 U 



error values with increasing noise levels. In contrast to the spatio-temporal dynamic im- 
age motion approach [ 17 ] a higher noise level requires the selection of a smaller (3 reg- 
ularisation parameters for the control part of the objective functional. The consistently 
lower error indicates an improved global motion prediction in our control approach ©- 
^ exerting a better temporal regularisation. Our explanation for this observation is that 
the control approach incorporates also future knowledge of the image sequence instead 
of using only past information with a prediction as in ifTTl . 

5 Conclusions 



We have presented an optimal control approach to image motion estimation including 
physical prior knowledge in a novel and exploratory way. It leads to an unconstrained 
optimisation problem, where the optimality system - from which the optimal state and 
the optimal control are determined - can be solved using an iterative gradient descent 
method. The forward-backward structure of the model allows for a robust estimation of 
the coherent flows by including prior knowledge that enforce spatio-temporal smooth- 
ness of the minimising vector field. 

In the case that the image measurements indicate changes of the current velocity 
distribution, Active control forces modify the system state accordingly. The presence of 
such forces may serve as an indicator notifying a higher-level processing stage about 
unexpected motion events in video sequences. 

The comparison of our results with the approach from [9 ] and the approach from 
1 17] demonstrates the ability of the control formulation to determine image motion 
from video sequences, and shows improved performance, especially for highly noisy 
image data. Our further work will include the modification of the Burgers equation to 
achieve better motion boundaries in the rarefaction area and the reformulation of the 
approach to a receding horizon formulation. 
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Abstract. This paper presents a new design procedure of the pole placement and 
the Luenberger observer for linear time- varying discrete systems. First, a simple 
calculation method to derive the pole placement state feedback gain for linear 
time-varying discrete systems is proposed. It is shown that the pole placement 
controller can be derived simply by finding some particular "output signal" such 
that the relative degree from the input to this new output is equal to the order of 
the system. Using this fact, the feedback gain vector can be calculated directly 
from plant parameters without transforming the system into any standard form. 
In general, the equivalent transformation of the time- varying linear system does 
not preserve the stability. The paper also shows the condition of the stability of 
the closed loop system. Then, this method is applied to the design of the observer. 
For this purpose, the anti-causal system is introduced. Using the dual property of 
this system, the observer gain vector is calculated by the proposed simple design 
technique of the pole placement control. 

Keywords: Pole-placement, Observer, Linear time- varying discrete system. 



1 Introduction 

The observer based pole placement control is the well established basic strategy of lin- 
ear control systems 1 1 1 |2|. As for time-invariant systems, the pole placement state feed- 
back gain for continuous linear time- varying systems can be obtained by tranforming 
the system into the controllability standard form (4). However, in the time- varying 
case, this calculation procedure is complicated. So, for the simplicity, Ackermann-like 
pole placement algorithm was proposed in [5|. Similarly, the derivation of Ackermann- 
like algorithm from the point of the relative degree and its application to the design of 
the observer was also proposed |9|. 

This paper considers the simplified design procedure of the pole placement control 
and the observer for linear time- varying discrete systems. This method can be regarded 
as "a discrete Ackermann-like algorithm". The basic problem is to find a time- varying 
state feedback gain for linear time-varying discrete systems, so that the closed loop 
system is equivalent to some time-invariant system with desired constant closed loop 
poles. It will be shown that the pole placement controller can be derived simply by 
finding some particular "output signal" such that the relative degree from the input to 
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this output is equal to the order of the system. Then, the feedback gain vector can be 
calculated directly from the system parameters without transforming the system into 
any standard form. It should be noted that, in general, even some linear time- varying 
system is stable, its equivalent system is not necessarily stable. To preserve the stability, 
the Lyapunov transformation is needed. The paper shows the condition for the pole 
placement closed loop system to be stable. 

The simple design procedure of the design of the observer is also presented in this 
paper. It will be shown that, as for the time-invariant case, the observer can be obtained 
by solving the pole placement problem for the particular anti-causal time- varying sys- 
tems. Because of the duality, the simplified pole placement technique can be directly 
applied to the design of the state observer for linear time- varying discrete systems. 

In the sequel, some basic properties of linear time- varying discrete systems are pre- 
sented in Section 2. The simple pole placement technique is proposed in Section 3, and 
then, this method is used to the observer design problem in Section 4. 

2 Preliminary 

Consider the following linear time- varying discrete system with a single input and a 
single output. 

x(k + 1) = A(k)x(k) + b(k)u(k) (1) 

y(k) = c T (k)x(k) (2) 

Here, x G R n , u G R 1 and y G R 1 are the state variable, the input signal and the 
output signal respectively. A(k) G R nxn ,b(k) G R n and c(k) G R n are time- varying 
bounded parameter matrices. 

The transition matrix from k = j to k = i of this system, <£(i, j), is defined by 

$(i, j) = A(i - l)A(i - 2) • • • A(j) i > j. (3) 

Definition 1. The system (1) is called completely reachable in step n from the origin, 
if for any x\ G R n , there exists a finite input u(m) (m = fc, • • • , k + n — 1) such that 

x(k) = and x(k + n) = x\ . 

Definition 2. UR(k) defined by the following equation is called the controllability 
matrix. 

U R (k) = [b(k + n - l),$(k + n,k + n - l)b(k + n - 2), 

■..,#(fc + n,fc + l)&(fc)] (4) 

Lemma 1. The system (1) is completely reachable in step n from the origin, if and 
only if 

rank Un(k) = n, v /c. (5) 

Definition 3. The system (1),(2) is called completely observable in step n, if from y(k), 
y(k + 1), • • •, y(k + n — 1), the state, x(k), can be determined uniquely for any k. 
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Definition 4. U (k) defined by the following equation is called the controllability 
matrix. 



U (k) 



c T (k) 
c T (/c + l)<£(/c + l,/c) 



c T (k + n- l)$(k + n - 1, k) 



(6) 



Lemma 2. The system (1),(2) is completely observable in step n, if and only if 

rank U (k) = n, Vfc. 

Let T(k) be a non- singular matrix for all k. The change of variable 

x(k) =T(k)w(k) 

transforms the system 

x(k + l) = A(k)x(k) 

into the following equivalent system. 

w(k + 1) = T~\k + l)A(fc)T(fc)w(fc) 
= i(fc)w(fc) 



(7) 



(8) 



(9) 



(10) 



As for the continuous system, the stability of the time-varying system does not neces- 
sarily implies the stability of the equivalent system (6) (7) ifTOl . 

Definition 5. If the transformation matrix T(k), and T _1 (/c) are bounded, (8) is called 
Lyapunov transformation. 

Lemma 3. Lyapunov transformation preserves the stability of the system. 

3 Pole Placement of Linear Time- Varying Discrete Systems 

Consider the system (1), that is 

x(k + 1) = A(k)x(k) + b(k)u(k) (11) 

The problem is to find the state feedback 

u(k) = h T (k)x(k) (12) 

which makes the closed loop system equivalent to the time-invariant linear system with 
arbitrarily stable constant poles. 

Now, consider the problem of finding a new output signal y(k) such that the relative 
degree from u(k) to y(k) is n. Here, y(k) has the following form. 



y(k) = c 1 (k)x(k) 



(13) 



Then, the problem is to find a vector c(k) G R n that satisfies this condition. 
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Lemma 4. The relative degree from u(k) to y(k) defined by (13) is n, if and only if 
c T (k) satisfies the following equations for all k. 

c T (k + l)b(k) = 

c T \k + 2)$(k + 2,fc + l)b(k) = 

: (14) 

c T (/c + n - l)${k + n - 1, fc + l)fc(fc) = 
c T (& + n)^(fe + n, k + l)6(fc) = 1 

(Here, c T (fc + n)<P(k + n, fc + l)6(fc) = 1 without loss of generality.) VV 
Proof : Using (14), y(fc + 1), • • •, y(k + n) can be calculated as follows. 

y(k) = c T (k)x(k) 
y(k + 1) = c T (/c + l)#(fc + 1, fe)x(fe) 

: (15) 

y(k + n-l) = c T (k + n - l)#(fe + n - 2, fe)a:(fe) 
y(A; + n) = c T (/c + n)$(k + n - 1, k)x(k) + u(fc) 

This implies that the relative degree from u(k) to y(k) is n. 
To obtain c T (k), (14) can be modified as follows. 

c T (/c) [ 6(fc - 1), <P(Jfe, fe - l)6(fe - 2), • • • , <P(fc, fc + 1 - n)6(fc - n) ] 
= c r (fc)C/ fl (fc-n) 
= [0, -..,0, 1] (16) 

From this, we have the following Theorem. 

Theorem 1. If the system (11) is completely reachable in step n, there exists a vector 
c(k) such that the relative degree from u(k) to y(k) = c T (k)x(k) is n. And, such a 
vector, c T (k), is obtained by 

c T (fe) = [0, ..., 0, lJC^fc-n) (17) 

The next step is to derive the state feedback for the arbitrary pole placement. The prob- 
lem is to find the time- varying state feedback gain vector so that the closed loop system 
is equivalent to some linear time-invariant system which has arbitrarily stable constant 
poles. 

Let q(z) be a desired closed loop characteristic polynomial of z-operator, i.e., 

q(z) = z n + an-iZ 71 ' 1 + • • • + a (18) 

which has its roots, Xi(i = 1, • • • , n), as desired stable closed loop poles. 

Since the new output, y(k) = c T (k)x(k), with c(k) obtained by (17), satisfies (14), 
by multiplying y(k + i) by ai (i = 0, • • • , n)(a n = 1) and then summing them up, the 
following equation is obtained from (14). 

q(p)y(k) = d T (k)x(k) + u{k) (19) 
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Here, d(k) 6 R n is defined by the following. 



d T (k) = [a ,ai,- ■■ ,a n -i, 1 



c r {k) 
c T (k + l)<P(k+l,k) 

c T (k + n)<l>(k + n,k) 



Hence, the state feedback, 

u = -d T (k)x(k) 
makes the closed loop system as follows. 

q(z)y(k) = 



(20) 



(21) 



(22) 



This control system can be summarized as follows. The given system and the pole 
placement state feedback are as follows. 



x(k + 1) = A(k)x(k) + b(k)u(k) 
u(jfe) = -d T (k)x(k) 

Then, the closed loop system becomes 

x(k + 1) = (^(jfe) - b(k)d T (k))x(k). 

Let T(k) be a time varying matrix denned by 

c T (k) 



T(k) 



c T (k + l)$(k + l,k) 



c T (k + n- l)<P(k + n-l,k) 



(23) 
(24) 

(25) 



(26) 



and define the new state variable w(k) by the following equations. 

x(k) = T(k)w(k), w{k) - 



m 

y(k + l) 



y(k + n — 1) 



(27) 



Then, (22) implies that the change of variable (27) transforms the closed loop system 
(25) into the following system. 

w(k + 1) = T~\k + l)(A(k) - b(k)d T (k))T(k)w(k) 
1 ••• 



-QLO • • 

A*w(k) 



1 

-OLn-l 



w(k) 



(28) 
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This implies that the closed loop system is equivalent to the time-invariant linear system 
which has the desired closed loop poles \i(i = 1, • • • , n) (det(zl — A*) = q(z)). 

Non- singularity of the transformation matrix and the stability of the pole placement 
control system are established in the following Theorems. 

Theorem 2. If the system (11) is completely reachable in step n, then, the matrix for 
the change of variable, T(k), given by (26) is nonsingular for all k. 
Proof : Since, if Uu{k) is nonsingular for all k, equations in (14) are satisfied for all k, 
we have the following equation. 



T(k)U R (k-n) 





1 



1 



(29) 



Here, the right hand side is an anti-triangular matrix with 1 in the anti-diagonal position, 
then, T(k) is nonsingular. 

Theorem 3. The pole placement control system obtained above is stable if T(k) is 
Lyapunov transformation matrix. 

The calculation procedure for pole placement feedback gain is as follows. 

[Calculation Procedure of the Pole Placement] 

Step 1. Calculate the state transition matrix and the controllability matrix by (3) and 

(4). 
Step 2. Calculate c T (k) by (17). 
Step 3. Define the desired closed loop characteristic polynomial by (18) and calculate 

the state feedback by (20) and (21). 
Step 4. Check if T(k) in (26) is Lyapunov transformation matrix. 

Example 1 

Consider the following unstable system. 



x(k + 1) = A(k)x(k) + b(k)u(k) 
y(k) = c T (k)x(k) 



(30) 



where 



A{k) 



b(k) 



1 2 

sin0.2k 



cosOAk 
2 



(31) 



c T (k) = [3 + sm0.5fe0] 
Here, we define Ai(fc), \2(k) and Xs(k) by the followings. 

7i (k) = 2 + cosOAk 

72 (fc) = 2 + sm0.2fe 

73 (fc) = 3 + sinO.bk 



(32) 
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From (17), c 1 (k) is obtained as follows. 



c T (k) = [ 0, l][b(k- 1), A(k - l)b(k - 2) ] _1 
1 ^ 



7i(fc-l) 



The purpose is to design the state feedback so that the closed loop system is equivalent 
to the linear time-invariant system with Ai =0.4 and A2 = 0.5 as its closed loop poles. 
This implies that the desired closed loop characteristic polynomial is 

q(z) = z 2 + 0.9z + 0.2. 



The pole placement feedback is u(k) 
calculated by 



-d T (k)x(k), and, from (20), d(k) is 



d T (k) = [0.2, 0.9, 1] 

= [d 1 (k)d 2 (k)'_ 
where, d\(k) and d 2 (k) are given by 



c T (k) 

c T (k + l)A(k) 

c T (k + 2)A(k + l)A(k) 



(33) 



di(fe) 



0.2 



0.9 



d 2 (k) = 0.9 



7i(fe-l) 7i (fc) 7i(fc + l) 

7i (fc) , 



7i(fe + l) 



It is readily checked that the transformation matrix 



T(fc) 



c T (fe) 
c T (fe + l)A(fe) 



•2 + 72W 



is the Lyapunov transformation. 

Fig.l shows the state response of this unstable plant with the initial condition, 
#i(l) = #2(1) = 1. The simulation result of the plant with the pole placement state 
feedback obtained above is in Fig. 2 with the same initial condition. 

4 State Observer 



In this section, we consider the design of the observer for the system (1),(2), that is the 
following linear time- varying system with a single input and a single output. 



x(k + 1) = A(k)x(k) + b(k)u(k) 
y(k) = c T (k)x(k) 



(34) 
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Fig. 2. Responce of the state variable (x) of pole placement feedback control 



The problem is to design the full order state observer for (34). Consider the following 
system as a candidate of the observer. 



x(k + 1) = F(k)x(k) + b(k)u(k) + h(k)y(k) 

= F(k)x(k) + b(k)u(k) + h(k)c T (k)x(k) 



(35) 



where F(k) e R nxn , and fe(fc) G .R n . Define the state estimation error e(fe) G i? n by 

e = a:(fe) - i(fc) (36) 

Then, e(fc) satisfies the following error equation. 

e(k + 1) = F{k)e(k) + (A(k) - F(k) - h(k)c T \k))x(k) (37) 

Hence, as well known, (35) is a state observer of (39) if F(k) and ft(fc) satisfy the 
following condition. 



F(k) = A{k) - h(k)c T (k) 

F(k) : arbitrarily stable matrix 



(38) 
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Then, the problem is to find h(k) such that F{k) is equivalent to some constant matrix 
F* with arbitrarily desired stable eigenvalues. To apply the pole placement technique to 
this problem, consider the pole placement control problem of the following anti-causal 
system. 

i(k - 1) = A T (k)i(k) + c(k)v(k) (39) 

where £(fc) G R n and v(k) G R 1 are the state variable and an input signal. 
The transition matrix, ^(i, j), of the anti-causal system (39) is the following. 

*(ij) = A T (i + 1) • • • A T (j - l)A T (j) (i < j) (40) 

From the property of the duality of the time varying discrete system, if the pair 
(A(k),c T (k)) is completely observable in step n, the pair (A T (k), c(k)) of the anti- 
causal system (39) is completely reachable in step n. 

Then, if the given system (34) is completely observable in step n, the system (39) 
has a state feedback 

v(k) = h T (k)£(k) (41) 

such that the anti-causal closed loop system is equivalent to the linear time-invariant 
anti-causal system with arbitrarily constant stable poles. 

This implies that for some state transformation matrix, P T (k) G R n , 

P T (k - l)-\A T (k) - c(k)h T (k))P T (k) = F* T (42) 

where, F* T is a constant matrix with arbitrarily stable eigenvalues. From this, we have 
the following equation. 

P^ik^Aik) - h(k)c T (k))P(k -1) = F* (43) 

Hence, using this h(k), the state observer for the system (34) is obtained. By applying 
the design procedure stated in the previous section, the calculation procedure for the 
observer is as follows. 

[Calculation Procedure of the Observer Gain] 

Step 1. Using the transition matrix (40), calculate the controllability matrix of the anti- 
causal system (39), Vr(&), by the following. 

V R (k) = [c(k-n+ 1), &(k -n,k-n + l)c(k - n + 2), 

••-,&{k-n,k-l)c{k)] (44) 

Step 2. Using the (17), calculate c T (k) for the anti-causal system (39) by the following. 

c T (k) = [0, ...,0, l]V^(k + n) (45) 

Step 3. Define the stable characteristic polynomial of backward shift operator, z _1 , in 
the similar way of (18). 

qiz' 1 ) = z~ n + an^z-^-V + • • • + a (46) 
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Step 4. In the similar way of (20), calculate h{k) by 



h T (k) = [ao,ai,---,a n -i,l 



c r (k) 



c T (k-l)&(k-l,k) 
c T (k -n)\P(k -n,k) 



and the observer gain vector is h(k). 

Step 5. Define the transformation matrix, P T (k), by 



p T (k) 



c r (k) 
c T (k-lp(k-l,k) 



c T (k -n+ l)W(k -n+l,k) 



(47) 



(48) 



and check if P(k) is Lyapunov transformation matrix. 

As for the linear time-invariant case, the observer based pole placement control sys- 
tem can be described by 



x(k + l) 
e(fc + l) 



A(k) - b{k)d T {k) b(k)<f(k) 

A(k) - h(k)c T (k) 



x{k) 
e(k) 



where e(k) := x(k) — x(k). Using the Lyapunov transformation matrix 

T(k) 
P(k-l) 

and, also from (28) and (43), the system matrix of (48) is equivalent to 

~A\ T~\k + l)b(k)d T (k)P(k - 1) 
0, F* 



(49) 



(50) 



(51) 



which implies that the observer based pole placement control of linear time- varying 
system satisfies the separation principle. 

Example 2 

Consider the system (30) and (31) of Example 1, again. Suppose that the state,#(fc), is 
not accessible. Then, we design the pole placement feedback using the state observer. 
In this example we show the calculation procedure to obtain the observer gain. We use 
Ai(fc), \2{k) and \$(k) as the same definitions as in Example 1. 

First, note that this plant is completely observable in step n, because the observability 
matrix is as follows. 



c T (k) 
c T (k + l)A(k) 



73 (fe) 

73 (k) 73(fc)7i(&) 



(52) 
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The system matrices of the anti-causal and dual system are as follows. 

A T (k) 
c(k) 



1 72 (k) 
7i (k) 2 







From (45), c T (fc) for the new output matrix is obtained as 



where, 



c T (k) = [ 0, 1 ][ c(k + 1), A T (k + l)c(k + 2) J" 1 



<5(fc) = 73(A: + l)73(A: + 2)7 1 (fc + 2) 



(53) 



(54) 



(55) 



The purpose is to design the state feedback for the dual system matrices, (39), so that 
the anti-causal closed loop system is equivalent to the linear time-invariant system with 
Ai = 0.2 and A2 = —0.1 as its closed loop poles. This implies that the desired closed 
loop characteristic polynomial is 

q(z) = z~ 2 - 0.1Z' 1 - 0.02. 

From (47), h T (k) is calculated as follows. 



h T (k) = [-0.02, -0.1, 1] 

= [/ii(fc)/i 2 (fc)] 
Here, h\{k) and h,2(k) are 



c T (k) 

c T (k - l)A T (k) 

c T (k - 2)A T (k - l)A T (k) 



(56) 



hi(k) 



h 2 (k) = 



Q.l73(fc)7i(fc) 

5(k - 1) 
73(fc-l)7i(fe-l) + 2 73 (fc)7i(fc) 

6(k-2) 
0.02 7 3(fc + 1) 0.2 73 (fc) 
S(k) S(k-1) 

73(fc-l)7i(fe-l)72(fe)+4 73 (fc) 
5(k-2) 



Hence, the observer gain vector is h{k). 
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Fig. 3. Response of the pole placement with the observer (x(k)) 




Fig. 4. Response of x\ (k) and x\ (k) 




Fig. 5. Response of X2 (k) and X2 (k) 



It is also readily checked that the transformation matrix 

p T (k) 



C T (k) 

c T (k - l)A T (k) 



is the Lyapunov transformation. 
From the above, the observer is 



x(k + 1) = {A(k) - h(k)g T (k)}x(k) + b(k)u(k) + h(k)y(k) 



(57) 
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and, the pole placement feedback using this observer becomes 

u(k) = -d T (k)x(k) (58) 

where d(k) is defined in (33) in Example 1. 

Fig. 3 ~ 5 show the simulation results of the pole placement feedback system with 
the observer. The initial condition of the plant is x\ (1) = X2 (1) = 1. 



5 Conclusions 

This paper proposed the simple derivation method of the pole placement state feedback 
gain for liner time- varying discrete systems. Feedback gain can be calculated directly 
from the plant parameters without the transformation of the system into any standard 
form, which makes the design procedure very simple. This technique is applied to the 
observer design procedure using the duality of the linear time- varying system. 
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Abstract. The spacecraft formation control problem sets high demands to the 
performance, especially with respect to positional accuracy The problem is fur- 
ther complicated due to scarce fuel resources and limited actuation effects, in 
addition to the many sources of disturbances. This paper addresses the problem 
of finding the optimal gains of spacecraft formation controllers. By optimal, we 
mean the gains that minimizes a cost functional which penalizes both the control 
efforts and the state deviation, while still guaranteeing stability of the closed-loop 
systems in the presence of disturbances. 

Keywords: Robustness, ISS, Moving average of disturbances, Spacecraft 
formation, gain tuning. 



1 Introduction 

Formations of spacecraft are mainly motivated by their flexibility and increased baseline 
length compared to monolithic spacecraft. They are complex systems which require 
precise control to maintain relative trajectories, even in the presence of disturbances. 
These disturbances are often only described by the statistical or averaged characteristics 
of the perturbing signals (e.g. amplitude, energy, average energy, etc.), and are due to 
for instance intervehicle interference, solar wind and radiation and gravitational effects. 
To account for the limited fuel resources, the control law should minimize the fuel 
consumption, while still guaranteeing a predefined accuracy. 

Ignoring the disturbances, an optimal control, that is the control input that minimizes 
some sensible cost function, can be derived using the Hamilton- Jacobi-Bellman equa- 
tion or Pontryagin's maximum principle. With the introduction of Hoo methods, by 
1 19], one can specify the level of plant uncertainty and the signal gain from disturbance 
inputs to error outputs |4|. Although the Hoo control problem was initially stated for 
systems described by transfer matrices, the ideas were soon translated into a state space 
setting [18], by realizing that the Hoo norm is the £2 induced norm in the time domain. 
The nonlinear equivalent of the Hoo problem was shown by 1 13 ] to be determined by 
the Hamilton- Jacobi-Isaacs (HJI) equation or inequality. An analytic solution to the HJI 
equation is general difficult to find. With the introduction of the inverse optimal control 
problem by Q, the performance index is a posteriori determined rather than a priori. 
In |[T0l this idea was applied to systems with disturbances, and it was shown that input- 
to-state stabilizability is a necessary and sufficient condition for what is known as the 
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inverse optimal gain assignment problem to be solvable. The sufficiency part is rely- 
ing on the Sontag type control law introduced in [16]. By solving the inverse optimal 
control problem solutions to a whole family of HJI equations are found. In (HJ the link 
between optimality and closed-loop stability was further investigated by relating the 
cost function to a Lyapunov function, with the purpose of an optimal selection of the 
controller design parameters. 

The problem of study in this paper is similar to the problem in [8], however we do 
not use the Lyapunov function as our cost function. Instead we introduce a new cost 
function, and solve the optimization problem with the stability constraints based on the 
Lyapunov analysis. The Lyapunov analysis of was performed in a previous paper by 
the authors, |6], and a guarantee for a hard bound on the state norm for input- to- state 
stable (ISS) systems in presence of signals with limited moving average were found. 
Furthermore, it was shown that the Lyapunov function can give an explicit estimate of 
the maximum of the disturbances' moving average that can be tolerated for a given pre- 
cision. For the sake of completeness, some of the results are repeated here. In this paper 
we address the problem of finding the optimal gains, subject to constraints imposed 
by the Lyapunov analysis to achieve the above mentioned precision and stability. We 
consider a cost functional that is quadratic in both state and control variables, and the 
optimization is performed using f mine on, because of its ability to handle nonlinear 
constraints. We emphasize that we are not doing optimal control in usual sense, as the 
the controller and observer has already been designed without necessarily optimality in 
mind. It is obvious that the performance is restricted by parameterization of the chosen 
control law. 

Notation and Terminology. A continuous function a : R> — > R>o is of class /C 
(a G /C), if it is strictly increasing and a(0) = 0. If, in addition, a(s) — > oo as s — > oo, 
then a is of class /Coo (ql G /Coo). A continuous function (3 : R>o x R>o — ► R>o is said 
to be of class ICC if, /?(•, t) G K, for any t G R>o, and /3(s, •) is decreasing and tends to 
zero as s tends to infinity. The solutions of the differential equation x = f(x,u) with 
initial condition xq G R n is denoted by #(•; xq, u). We use | • | for the Euclidean norm 
of vectors and the induced norm of matrices. The closed ball in R n of radius S > 
centered at the origin is denoted by Bs, i.e. Bs := {x G R n : \x\ < 5}. | • \ s denotes 
the distance to the ball Bs, that is \x\ 6 := mf ze s s \x — z\.U denotes the set of all 
measurable locally essentially bounded signals u : R>o — > R p . For a signal u G U, 
IMIoo •= ess sup t> o|^(£)|. The maximum and minimum eigenvalues of a symmetric 
matrix A are denoted by A max (74) and A m i n (A), respectively. I n and n denote the 
identity and zero matrices of R n x n respectively. We use E for the expectancy operator. 

2 ISS Systems and Signals with Low Moving Average 

We start by recalling some classical definitions related to the stability and robustness of 
nonlinear systems of the form 

x = f(x,u), (1) 

where x G R n , u G U and / : R n x W — > R n is locally Lipschitz and satisfies 
/(0,0) = 0. 
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Definition 1. Let 5 be a nonnegative constant and W C U. The ball Bs is said to 
be globally exponentially stable (GES) for ([7]) with respect to W if there exists some 
positive constants k\ and &2 such that the solution offil} satisfies 

\x(t',x ,u)\ <£ + /ci|£ |e- fe2t , Vt>0. (2) 

for all x E M n and all u 6 W. 

We next recall the definition of ISS, originally introduced in fl31 . 

Definition 2. The system x = /(x, u) is said to be input- to- state stable (ISS) if there 
exist j3 G ICC and 7 G /Coo such that, for all xo G W 1 and all u eU, the solution offil} 
satisfies 

\x(t;x ,u)\ < 0(\x o \,t) +7(lklloo) , Vt > 0. (3) 

ISS thus imposes an asymptotic decay of the norm of the state up to a function of the 
amplitude \ \ u \ | oo of the input signal. 

We also recall the following well-known Lyapunov characterization of ISS, origi- 
nally established in [12] and thus extending the original characterization proposed by 
SontaginlfTTl. 

Proposition 1. The system (0) is ISS if and only if there exist a, a, 7 G /Coo and n > 
such that, for all x G IR n and all u G W, 



a(\x\) <V(x) <a(\x\) (4) 

dx 

7 is then called a supply rate for ([7]). 



^(x)f(x,u) < - K y(x)+'y(\u\). (5) 



The input signals we consider in this paper are slightly more restrictive than those in for 
instance 1 1 ], but the advantage is that a hard bound on the state norm can be guaranteed. 
Namely, we consider input signals with bounded moving average. 

Definition 3. Given some constants E, T > and some function 7 G /C, the set 
W 7 (E, T) denotes the set of all signals u eU satisfying 



rt+T 

/ <y(\u(s)\)ds <e, vte: 



The main concern here is the measure E of the maximum energy that can be fed into 
the system over a moving time window of given length T. These quantities are the 
only information on the disturbances that are taken into account in the control design. 
More parsimonious control laws than those based on the disturbances' amplitude or 
energy can therefore be expected. We stress that signals of this class are not necessarily 
globally essentially bounded, nor are they required to have a finite energy. Robustness 
to this class of signals thus constitutes an extension of the typical properties of ISS 
systems. 

With input signals with bounded moving average, the following result of l6l 
guarantees global exponential stability of some neighborhood of the origin. 
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Corollary 1. Assume there exists a continuously differentiable function V : R n — > 
R>o, class /Coo function 7, functions a (s) = cs p ^^J a (s) = cs p , w/£/z c, c, p positive 
constants, and a positive constant n such that (0) and @ /zo/d for all x G M n aftd 
a// u G M p . TTz^ft, g/v^ft a^ry T, J > 0, the ball Bs is GES for ([7) with any signal 

u G yUy (£■, T) provided that 

„_ c-n ^ e* T - 1 
E(T,(J)< 



2 2e^ T - 1 " 

3 Spacecraft Formation Control 

The results of Section [2] were in [ 6 ] exploited to demonstrate robustness of a spacecraft 
formation control in a leader- follower configuration. We will here only include the parts 
necessary to keep this section self-contained. For further details, the reader is referred 
to the original paper. 

3.1 Spacecraft Models 

The model for the leader spacecrafts motion with respect to a moving coordinate frame 
is given by 

P + C(v )p + D(v ,v )p + n(r ,p) = F t (6) 

while the model for follower spacecraft with respect to the leader is given by 

9 + C {po) p + D (z> , f> ) p + n (r + p, p) = F f - F t , (7) 

where Fi := (ui + di)/mi, Ff := (uf + df)/rrif 9 and where subscripts I and / stand for 
the leader and follower spacecraft respectively, p is the position of the leader spacecraft 
with respect to a coordinate frame in a Keplerian orbit, where as p is the position of the 
follower spacecraft with respect to the leader, mi and m/ are the spacecraft' masses, u\ 
and Uf are the control inputs, and d\ and df denote all exogenous perturbations acting 
on the spacecraft. Furthermore, 



"0 


-10" 


1 












C(z> ):=2z> C, C:~- 

D (z> , v ) := i%D + v C, D := diag(-l, -1, 0), 

and 

/ x { r +p r \ 

\\r +p\ 6 \r \ 6 J 

with r being the origin of the coordinate reference frame, and v Q the true anomaly. We 
make the following assumptions on the first and second derivatives of v Q \ 

Assumption 1. The true anomaly rate v Q and true anomaly rate -of- change i) of the 
reference frame satisfy ||£ ||oo < A> and ||#o||oo < ft v , for some positive constants 
(3y o and /3j) o . 
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3.2 Controller and Observer Design 

In (6) the proposed controller of the leader spacecraft were: 

ui =mi p d + C (v Q ) Pd + D (v ,i) ) p + n (r ,p) - h (p - p r ) (8) 

with pd being the reference trajectory, e\ = p — pd the tracking error, p = p — p the 
estimation error, p r = pd — lieu Po = P — lip, and with velocity estimate provided by 

p=ai + (h+£i)p (9) 

ai=pd + hliP- (10) 

The controller for the follower spacecraft has a similar structure, given by 



u f =m f 



Pd + Pd + C (z> ) (pd + Pd)+D (z> , v ) (p + p) + n (r + p, p) + n (r ,p) 



*J (PO -Pr) - */(Po -Pr) 



(11) 



with p being the reference trajectory, e/ = p — p^ the follower spacecraft tracking error, 
p = p — p the estimation error, auxiliary signals p r = Pd~ lf&f and po = p — lfp, and 
with the observer 



p=a f + (l f +l f )p 
a/ =pd + //^/P 



(12) 
(13) 



The parameters ku h, lu kf, If and If denote some positive tuning gains. The closed- 
loop system may now be summarized into the equations 



X = A(i> o (t),0)X + Bd, 



(14) 



where X := (X^XJ) T withXi := (ej , ej ,p T ,p T ) T andX f := (ej, ej, p T , ^ T ) T , 
d:= (df,d]) T ,6 := (k h l h l h k f ,l f , l f ) T and A := blckdiag(A z , A f ) with 



A.K) := 



"0 3 J 3 o 3 o 3 ' 

Q>21 Cl22(i / o) CL23 «24 

3 3 3 I 3 

a 4,\ d^lipo) &43 &44 



(15) 



for i G {/,/}. Out of notational compactness, the following matrices have been used: 
021 •= «4i := -hlils, a 2 2 := «42 := -C(v ) ~ hh, «23 := Mi^3, «24 := &i^3 , 
a 43 := (fci - h)lih and a 44 := (^ — ^ — li)h- Finally, B := (Bj ,Bj) T with 



mi 



"0 3 3 " 


/303 


0303 


/303 



and 5/ := 



VfliVflf 



3 


O3 " 


m// 3 


TOi-^3 


3 


3 


mjh 


mih_ 
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3.3 Robustness of the Overall Formation 

The following result establishes robustness of the controlled formation to a wide class 
of disturbances: 

Proposition 2. Let Assumption \7\ hold. Let the controller of the leader spacecraft be 
given by (I51)-rf70l) and the controller of the follower spacecraft be given by (f771) -([75l) 
with, for each i G {/, /}, k > 2ki, hi > 2k*, where 

■, ^ _ \ "i H~ Pi if ki ~ t-i S: ki*~i 

1 : " \M 

with 

0i := Pu \/2ff \ 1 f I 1 + -i I ^-^ . (16) 




Given any precision 5 > and any time window T > 0, consider any average energy 
satisfying 



<5-T "■!" I'" T,\/i'1 ! 1 ^1}S 2 ^7^7- (17, 



where 



mm ie{u} k*J m&x ie{u} \^\ 

I -J (18) 



max ie{IJ} \e % + iy4£fTT+ ^} ' 

Then, for any d G W 7 (£^, T) w/z^r^ 7(5) := s 2 , the ball Bs is GES for the overall 
formation summarized by H4i . 

The proof of the proposition is found in [6). 

As already stressed, the results recalled in Section [2] allow to expect more parsimo- 
nious solicitation of the actuators than classical ISS-based reasonings. However, such 
an improvement will only be made practical if the gains of Proposition [2| are tuned in 
an adequate manner. The rest of the document focuses on this issue. 



4 Optimal Controller and Observer Gain Tuning 

Of simplicity we will in the following consider the unperturbed version of (TBI) . 

X = A(v o (t),0)X, (19) 

since d is only characterized by its moving average and therefore difficult to incorporate 
in the optimization problem we will set forth in the sequel. One possibility to incorpo- 
rate the disturbance into the optimization problem would be to look at the worst case 
scenario, which is typical of differential game problems or the inverse optimal gain as- 
signment problem mentioned in the introduction. The worst case excitation due to the 
disturbances with limited moving average, can for a linear system be found using the 
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results of Q. We however, aim at finding the parameters 0, which are the best com- 
promise between fast state convergence and limited control efforts, while ignoring the 
effect of the disturbance. If the problem is posed as an optimization problem, a com- 
mon choice is to use a cost function which quadratically penalizes control and state 
deviation, i.e. a cost function of the form 



J(0) 



rth 



QX(t)+u(t) 1 Ru(r)dr 



(20) 



with weighting matrices Q G M 24x24 and R G M 6x6 both being symmetric and positive 



definite, optimization horizon th and u = (uj , uJ) T , where 



ui = mi 



D(i/ ,i> )ei -ki(p - p r ) 



and 



u f = m f 



D (z> , i> ) (ei + e f ) - k t (p - Pr) ~ kf (p - Pr) 



(21) 



(22) 



are the terms of ® and (fTTT) , respectively, that we want to penalize. It can be noticed, 
that we have chosen not to penalize the feed-forward terms from the reference and the 
gravity compensation in ([8]) and (ITTb . The cost d20l) , can be written as: 



J{0) = [ H X t (t)(Q + (T + H) T i?(T + E))X(r)dT 

^^0 



(23) 



with 

T:=m f 

and 



-Mih *hhh 



Mih 



:hh 3 3 3 



o 3 



n f • - m f m f • - m f 

hlih hh ~hhh ~hh kf£ f I 3 k f I 3 -k f £ f I 3 -k f I 3 



miD(P , i) ) 3 

3 m f D(v , v ) 



h o 3 3 3 3 3 3 3 " 
h o 3 3 3 h o 3 o 3 3 



(24) 
(25) 



We now introduce the state transition matrix (£>{t, t ) which is the unique solution of 

d 



dt 



&(t,to) = A(0 (t),6)$(t,to) : 



with initial condition 
The solution to ( fT9t is 



<P(t ,t ) = hi- 



and is used to rewrite the cost function into the following form 

J(0) = Xj f " # T (r, to)(Q + (T + S(t)) t R(T + 3(t)))#(t, t )dTX . 

Jto 



(26) 

(27) 
(28) 

(29) 
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We will avoid dependency of the cost function on initial conditions, and will assume that 
Xo are randomized variables with zero mean, and covariance matrix equal to identity, 
that is E{X Xj} = I 2 a- By defining 

P(t h , to) := / " ^ T (r, t )(Q + (T + S(r)) T R(T + S(t)))*(t, t )dr (30) 

J t 

and using the fact that 

E{XjP(t h ,t )X } = E{tmceXj P(t h ,t )X } 
= E{traceX X T P(^,to)} 
= trace E{X X T }P(^, t ) 
= trace P{t h , t ) , 

our optimization problem can be formulated as 

minE{ J(0)} = min trace P(t h , t ) (31) 

o o 

subject to 

—$(t,t )=A(i> (t),0)$(t,to), (32) 

with $(t , t ) =/ 2 4, 

^P(t, t ) = <2> T (£, t )(Q + (T + ~ (£)) T )it>(T + S(t)))*(t, to) , (33) 

with P(t , to) = (Q + (T + S(t )) T R(T + S(t ))), for each z € {/, /}, 

Z* > 2fei (34) 

fci > 2/c* (35) 



where 



with 



and 



/c^ := < ~ (3d) 

IA/s otherwise, 



ft := A> A / 2^ + 1 + i 1 + -^ I ^V , (37) 

mf I mf 



/?2 I i I i j L 



k h £ h k f J f e[0,2] (38) 

h e [0, 10] (39) 

If e [0, 15] . (40) 
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The constraints d38t-(l40l) are introduced to more efficiently solve the problem. The 
problem was solved using the interior point algorithm of the function fmincon in 
MATLAB. 

The true anomaly, v Q , of the reference frame can be obtained by numerical integra- 
tion of the equation 

y/JI(l + e cos^ (t)) 2 



Vo(t) 



(|(ra + r p )(l-c2)) 



3/2 ' 



(41) 



where radius of perigee and apogee is chosen as r p = 10 7 m and r a = 3 x 10 7 ra, 
respectively. The true anomaly rate and rate of change are calculated using flTt and 



i>o(t) 



-2/ie (1 + e cos^ (t)) sin u {t) 
{\(r p + r a )(l-el)f 



(42) 



The eccentricity can be calculated from r a and r p to be e = 0.5, and we see from 
(l42k that the constant fi$ o in Assumption [T] can be chosen as /3j) = 4 x 10 -7 . The 
constant (3„ o in Assumption Q] can be chosen as (3„ o = 8 x 10 -4 as seen from (|4T1) . 
We have assumed that the reference frame is initially at perigee, that is v Q (to) = and 
v (to) = v p /r p , where 



v p = 4 2/i — 



r P (r p + r a ) 



(43) 



The horizon of the problem was chosen as th = 30, and the optimization problem 
was solved using R = I24 and Q with ones along the diagonal except that Q(2, 2) = 
Q(3, 3) = Q(14, 14) = (5(15, 15) = 20. It can be argued that choosing the elements of 
R and Q is as hard as finding the parameters in the original problem. However, with 
the common choice of diagonal Q and R, choosing Q and R is more intuitive in the 
authors opinion. The optimization problem, was solved with all combinations of initial 
conditions 6i G {0, 1, 2}, and the results are summarized in Table [TJ The reason for 

Table 1. Optimization result of 729 iterations, covering all combinations of initial condi- 
tions 0i G {0,1,2}. By Best value, we mean the parameters which gives the lowest cost 
function value. 



Parameter 


ki 


£1 


h 


kf 


*/ 


h 


Best value 
Mean 


0.3382 
0.3280 


0.2658 
0.2809 


2.0048 
1.6541 


0.3738 
0.3855 


0.3302 
0.3154 


1.7644 
1.8660 



solving the optimization problem with several different initial values, is to find, if not a 
global minimum, then at least a good local minimum. Looking at mean values in Table 
Q] we see that the parameters l\ and If are most affected by different initial guesses for 
the parameters. 

The parameters giving the lowest cost function were then used as initial guess, and 
the problem was solved with different values for the horizon th- The results in Table [2] 
suggests that the found parameters represent a good local minimum. 
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Table 2. Optimization result with different values of the horizon length th 



Horizon 


ki 


£i 


h 


kf 


if 


If 


20 


0.3195 


0.2743 


1.5802 


0.3829 


0.3200 


1.8948 


50 


0.3382 


0.2658 


2.0048 


0.3738 


0.3302 


1.7644 


100 


0.3382 


0.2658 


2.0048 


0.3738 


0.3302 


1.7644 


300 


0.3415 


0.3089 


1.9524 


0.3631 


0.3245 


1.7608 


1000 


0.3382 


0.2658 


2.0048 


0.3738 


0.3302 


1.7644 



4.1 Simulations 

We will now provide simulations of the spacecraft formation, using the gains achieved 
from the optimization procedure in the previous section. For simplicity, we choose 
the desired trajectory of the leader spacecraft to coincide with the reference orbit, i.e. 
Pd( m ) = (0, 0, 0) T . The reference orbit is generated by numerical integration of 



p 



i3'°' 



(44) 



with r (0) = (r p , 0, 0) and r (0) = (0, v p , 0), with v p as in d43t . Initial values of the 
leader spacecraft are p (0) = (2, -2, 3) T and p (0) = (0.4, -0.8, -0.2) T . The initial 
values of the observer are chosen as p (0) = (0,0, 0) and a\ (0) = (0, 0, 0) . 

The reference trajectory of the follower spacecraft are chosen as the solutions of a 
special case of the Clohessy-Wiltshire equations, cf. [3]. We use 



Pd (t) = 



10 cos v Q (i) 

-20sinu o (t) 





(45) 



This choice imposes that the two spacecraft evolve in the same orbital plane, and that 
the follower spacecraft makes a full rotation about the leader spacecraft at each orbit 
around the Earth. The initial values of the follower spacecraft are p (0) = (9, — 1, 2) T 
and p (0) = (— 0.3, 0.2,0. 6) T . The initial parameters of the observer are chosen to be 
p (0) = p d (0) = (10, 0, 0) T and a f (0) = (0, 0, 0) T . We use m f = m t = 25 kg both 
in the model and the control structure. 

With as the best value in TableQ] we find from dHJ) that n = 0.0901. Over a 10 
second interval (i.e. T=10), the average excitation must satisfy E(T, S) < 0.0061 5 2 , 
according to (fTD) . We consider two types of disturbances acting on the spacecraft: 
"impacts" and continuous disturbances. The "impacts" have random amplitude, but 
with maximum of 1.5 N in each direction of the Cartesian frame. For simplicity, we 
assume that at most one impact can occur over each 10 second interval, and we as- 
sume that the duration of each impact is at most 0.1s. The continuous disturbances are 
taken as sinusoids, also acting in each direction of the Cartesian frame, and are chosen 
to be (0.1 sinO.Olt, 0.25 sin0.03t, 0.3 sin0.04t) T for both spacecraft. Notice from © 
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Fig. 1. Leader spacecraft 

that the relative dynamics are influenced by disturbances acting on the leader and fol- 
lower spacecraft, so the effect of the continuous part of the disturbance on the relative 
dynamics is zero. It can easily be shown that the disturbances satisfy the following: 



rt+10 

I 



\d(r)\ 2 dr < 1.42, Vt > . 



Figure |l(a)[ |l(b)| and |l(c)| show the position tracking error, position estimation error 
and control history of the leader spacecraft, whereas Figure [2(a)! |2(b)| and |2(c)| are the 
equivalent figures for the follower spacecraft. Figure [T(d)| and [2(d)] show the effect of 
d\ and df — d\ acting on the formation. Notice in Figure |2(d)| that the effect of the 
continuous part of the disturbance is canceled out (since we consider relative dynam- 
ics and both spacecraft are influenced by the same continuous disturbance), whereas 
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Fig. 2. Follower spacecraft 



the effect of the impacts has increased compared to the effect of the impacts on the 
leader spacecraft. Since E = 1.42, and should satisfy E < 0.0061 J 2 , this gives a 
very large S. As can be seen from Figure |l(a)[ |l(b)[ |2(a)[ |2(b)| the actual precision 
reached, is much better than the theoretical expectations. The reason for this is that 
the constraints on the control gains are based on Lyapunov analysis, which in general 
yields very conservative results, and also conservative estimates of the disturbances the 
control system is able to handle. However, we stress that the constraints on the gains 
based on moving average of the disturbance are much more relaxed than those obtained 
through a classical ISS approach, i.e. relying on the disturbance magnitude. Figure [T(c)| 
and Figure [2(c)! shows that the conservative estimates results in large transients in the 
actuation, but by finding the optimal parameters, the control efforts are heavily reduced 
compared to [6], while the tracking errors are kept at a reasonable level. 
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Abstract. The performance of Wireless Sensor Networks (WSNs) during the 
tracking of dynamic targets is addressed in this paper. The strategy outlined in 
this paper uses a Distributed implementation of a Kalman Filter to track 
dynamic targets. In contrast to the results reported in the literature, the 
approach in this paper has the Kalman Filter running on only one network node 
at any given time. The knowledge learned by this node, i.e. the system state and 
the covariance matrix, is passed on to the subsequent node running the filter. 
Since a finite subset of the sensor nodes is active at any given time, target 
tracking can be accomplished using lower power compared to centralized 
implementations of the Kalman Filter. The tracking problem in WSNs is 
formulated mathematically and the stability and tracking error of the proposed 
strategy is rigorously analyzed. Numerical simulations are then used to 
demonstrate the utility of the proposed technique. The results in this paper show 
that the proposed technique for target tracking will result in significant savings 
in power consumption and will extend the useful life of the WSN. 

Keywords: Distributed kalman filter, Wireless sensor networks, Target 
tracking. 



1 Introduction 

Surveillance of remote inaccessible areas and the detection and tracking of intruders 
are some of the important applications of Wireless Sensor Networks (WSNs). 
Research in WSNs has addressed several important issues in optimal deployment, 
coverage, routing, and energy efficiency of the WSNs [1, 2, 3, 4, 5, 6] Diffusion and 
directed diffusion approaches have been proposed to address coverage, routing, 
discovering, and sensing fusion issues in WSNs [7]. The application of WSNs in 
surveillance and monitoring of target areas have also been widely researched [8]. 
While the results presented in these papers are encouraging, their applicability in low 
cost WSNs with large measurement noise and faulty measurements is fraught with 
problems. In recent years, Kalman Filters have been proposed to address the 
uncertainty and the noise in the measurements [9, 10, 11, 12, 13, 14, 15]. The 
convergence analysis of extened Kalman Filters also proposed in [16, 17]. Both 
centralized and distributed implementation of the Kalman Filter was proposed to 

J.A. Cetto et al. (Eds.): Informatics in Control, Automation and Robotics, LNEE 89, pp. 349- |362l 
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make their use suitable to WSN applications. However, these techniques are still 
power intensive and require significant amounts of onboard power for communication 
and computation. 

Two classes of Kalman filtering approaches have been implemented in WSNs. The 
first approach is centralized Kalman Filters [15] where every sensor node takes 
measurements and communicates with the other nodes while simultaneously 
computing its own version of Kalman Filter. In this approach, the sensor nodes' 
power will be depleted quickly because of excessive measurements and inter-node 
communication. Moreover, it is sometimes impractical for a sensor node to 
communicate with all the other nodes due to limitation of communication ranges. The 
second method is distributed Kalman Filters [9, 13, 14] where every neighbor node 
runs its own version of the Kalman Filter and shares the information with all other 
neighbors to reach the consensus of the system. The approaches above are distributed 
in processing. The number of neighbor nodes determines how expensive the 
algorithms are in terms of power consumption and communication complexity. 
Consequently, these approaches are not efficient because they require extensive inter- 
communication among neighbor nodes. 

In this paper, the distributed Kalman Filter is proposed to estimate the position of 
the target. The approach is different from the above work in the sense that the Kalman 
Filter is implemented in a distributed fashion across the WSNs. At a given instant, 
only one master node runs the Kalman Filter using the measurement inputs from its 
neighbors and shares the estimated knowledge with the subsequent master node. The 
neighbors within a certain distance from the target measure the distance to the target, 
and transmit measurements to the master node. On one hand, the procedure 
significantly reduces the communication costs among the neighbor nodes in 
comparison with the algorithms reported elsewhere in the literature. On the other 
hand, since the master node alone executes the Kalman Filter and the neighbor nodes 
only perform measurement functions, the complexity of the WSN is greatly reduced. 
This results in reduced communication costs in the entire sensor network and 
complexity of the tracking algorithm. Consequently, the performance of the 
distributed Kalman Filter is as good as that of the centralized Kalman Filter. 

This approach is validated through mathematical analysis and simulation examples. 
The algorithm was also able to track the target with random directions with acceptable 
estimated results. The estimation results showed that the model is robust to 
measurement noise and the change in velocity. The estimated knowledge of the 
Kalman Filter including system state and co variance matrix is passed directly to the 
subsequent master node where the Kalman Filter is run. Another aspect of the 
proposed algorithm is that the master node determines the direction and velocity of 
the intruder and wakes up appropriate sensor nodes in the direction of the target 
travel. Thus, nodes further away from the target are inactive and only a small subset 
of the nodes participates in the sensing. Prior to the start of the tracking, the 
knowledge of the maximum target velocity can be used, the boundary nodes of the 
sensor field are activated in round robin fashion discussed in [6] to save energy. 

The rest of the paper is organized as follows: Section 2 is the problem formulation. 
Section 3 is the convergence analysis of the distributed Kalman Filter. Section 4 is 
discussion. In section 5, we show the numerical simulation. Section 6 and 7 is 
conclusion and appendix. 
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2 Problem Formulation 

A closed and bounded sensor field in three dimensional (3D) Cartesian coordinate 
system is densely deployed with stationary sensor nodes. There is an intruder entering 
the sensor field with unknown nonlinear trajectory, the problem is how to the sensor 
network tracks the target correctly. It is assumed that each node has omnidirectional 
sensing capability to measure the distance between the target and itself. 
At time k, the tracking system is governed the following equations 



*k+i = f(x k ) + w k 
y k+1 = h(x k+1 ) + n k+1 



(1) 
(2) 



x k - [ v k>Pk] T is the state of the target at time k where v k E M 3 and p k E M 3 are the 
velocity and position respectively in 3D coordinate system. y k E M 3 is the measured 
position of the target which is calculated by trilateration algorithm (3). w k E U 6 and 
n k E M 3 are Gaussian distributed process noise with covariance matrix Q and 
measurement noise with covariance matrix N respectively. Q and N are assumed to be 
symmectric and positive semidefinite. f(x k ) and h(x k ) are assumed to be continuous 
with respect to time and differentiable with respect to x k . Suppose that the maximum 
velocity of tracking targets is known. Thus, the domain V E U 6 of f(x k ) in (1) is a 
compact and connected set. 

The target, represented by * symbol shown in Fig. 1, is moving in the direction of 
vector OE. The region R is defined by the sphere of radius R l9 the radius of R 2 and 
angle 2a - the region limited by the bold line. R 2 ,R 1 , and R a (R 2 > R t > R a ) are 
activation radius, sensing radius, and measurement radius respectively. Fig. 1 is the 
projection of the region R onto a plane that is parallel to the moving direction of the 
target. All the sensor nodes inside the region of activation R are activated, while the 













Direction target 
is heading to 

E 



Fig. 1. The target represented by * at point O. The boundary of the region of activation R is 
limited by line AB, curve BC, line CD and curve CA (the bold line above). The curve BC is 
formed by part of the sphere(0, R 2 ). No nodes outside sphere (0, /? x ) can sense the target. All 
the nodes inside R are activated. However, only the sensors inside the sphere (0,R a ) are 
actively taking measurement. 
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nodes outside the region are in sleep mode to save power. All the nodes inside the 
sphere (0, #1) can sense the target while no node outside can detect the target. 
However, only nodes inside the sphere (0, R a ) are actively taking measurements and 
reporting the data to the master node. 

A master node is selected depending on two criteria: the distance to the target and 
residual power. The sensors inside the sphere with radius R a take measurements and 
transfer the measured data to the master node. The master node runs the extended 
Kalman Filter for the system (1) and (2) in the distributed sense and obtains the 
estimated position and the direction of the target. The master node broadcasts the 
learned knowledge of the target to its neighbors. After receiving the information, a 
node will turn on or off depending on whether it is inside or outside region R. 

2.1 Position Calculation 

Suppose that at time k there are n sensor nodes that are actively taking measurements 
whose coordinates are (x kl , y kl , z kl ), (x k2> y k2 , z k2 ),.. (x knt y kn> z kn ), and 
measured distances from each nodes to the target are d lt d 2 , • •d n respectively. 
These measurements are sent to the master node. Consequently, master nodes run the 
trilateration algorithm using least squares method. The solution of the target's 
coordinate y k = (x kt , y kt , z kt ) in (2) is given by 



yu = 



x kt 

y^ 

z kt 



= (A T A)~ 1 Ab 



(3) 



where A and b are in the following forms 



20k2 - x kl ) 2{y k2 - y kl ) 2{z k2 - z kl ) 
2 (x k3 ~x k2 ) 2 (y k3 -y k2 ) 2 (z k3 - z k2 ) 

2( x kl — Xkn) 2 (ykl ~ ykn) 2( z kl ~ z kn) 



B = 



(d\ - d\) + (x\ + y\ + z\) - (x\ + y\ + z\) 
{d\ - d\) + {xl + y\ + z§) - [x\ + y\ + z\) 

(dl - d\) + (x\ +y\+ z\) - (x 2 n + yl + zl) 



Since the measurement y k in (3) is the position of the target, the measurement 
equation (2) becomes linear equation 



h(x k+1 ) = Hx k+1 where H 



10 
10 
1 



(4) 
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2.2 Distributed Kalman Filter 

The distributed Kalman filter of the system (1), (2) and (4) is run at only a given 
master node. 



x k+i = x k+l\k + ^/c+i(y/c+i — Hx k+1 \ k ) 
Kr+i = ?k+i\kH (HP k+1 \ k H + N)~ 
?k+i = U ~ Kk+lH)Pk+l\k 

*k+l\k = f($k) 

Pk + i\k = FP k F T + Q 



Fv = 



Af(Xk) 

dx k 



\x k =x k 



(5) 
(6) 
(7) 

(8) 
(9) 

(10) 



Equations (5)-(7) reflect measurement update, and equations (9)-(10) reflect time 
update. F is the value of Jacobian matrix of function / in (1) at time k. The initial 
value P of P k in (9) is symmetric positive semidefinite matrix. If the sampling time is 
fixed At, then F k in (10) a constant matrix. 



F = 



dx k 



10 

10 

10 

At 1 

At 1 

At 1 



(11) 



3 Performance Analysis 

3.1 Assumptions 

Before doing the stability analysis, the following assumptions are made. 

(i) Function f(x k ) in (1) is two time differentiable with respect to x k . 

(ii) The state error is Gaussian distributed and the covariance matrix Q is uniformly 
bounded. 

(iii) The range measurement error is Gaussian distributed. 

(iv) The sensor nodes are deployed dense enough that the joint measurement error is 
close to Gaussian with zero mean and the joint measurement covariance matrix 
N is uniformly bounded. 

(v) The sampling frequency is high enough. 

(vi) There is no delay when there is changing in master nodes. 

(vii) Every measured sensor node is within one hop to the target 
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3.2 Stability Analysis 

The state of equation (1) can be linearized as follows. 

x k+1 = Fx k + AF k x k + w k = a k Fx k + w k 



(12) 



Where AF k and a k are the diagonal matrix represented the nonlinear terms. If the 
system (1) is linear, AF k = and a k is equal to identity matrix. From (3), the 
equation (2) becomes 

Vk+i = h(x k +i) = H *k+i + n k 



The Lyapunov function candidate is chosen as 

Vk+i = ^k+i^k+i^k+i where x k+1 = x k+1 — x k+1 
From (20) we have 

*k+l\k = a kFXk an d e k+l = y/c+1 — ^ x k+l\k = Hx k+1 \ k 

Then, from (23)-(27) and assumption (v) we have 

Vk+i = \Xk+i\k ~ Pk+iHN~ e k+1 ) P k+1 (x k+1 \ k — P k+1 HN~ e k+1 ) 
= Xk+i\k P k+iXk+i\k - e k+i[N~ H P k+1 HN~ -2N~ ]e k+1 

From (6), (7) and (9) we have P k ^ = P k * Mk + //AT 1 // 7 
Vk = Vk + i ~V k = x T k [F T a T k (FP k F T + Q^^F - P^^e^N-^HP^^^N^e^ 

= (Fx k ) T [a T k (FP k F T + 0-V - {FP k F T y^Fx k + e T k+1 N~\HP k+1 H T - N)N^e k+1 
For any vector m we have 

m T [a T k (FP k F T + Qy^a k - (FP^y^m 



(13) 
(14) 

(15) 
(16) 



Thus, if 



^max( a k) ^ 



Then AV k < 0. 

On the other hand, the matrix 



< * ma A(FP k F T + QT^Wa^WmW 2 - X m in{{FP k F T y^\\m\\ 

HP k+1 H T - N < and 
KinWPk? 7 )- 1 ) 



*maA{FPkF T + QY 



(17) 
(18) 



k = 



r h 

HF 
HF n- 



where Ihf\ = 



oo oi o On 

10 

oooooi 

At 1 
At 1 
At 1 
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Clearly O k is full rank then F k and H k are observable. By LaSalle lamma, we conclude 
that 

lim x k = (19) 

k^co 



4 Discussion 

The range measurements using in (10) normally have noise. If the noise statistic is 
assumed to be Gaussian, then the jointly noise distribution in (2) or in (9) is neither 
Gaussian nor zero mean in general. In this subsection, it is stated that if the sensors 
are deployed dense enough, the jointly measurement error statistic v k (2) is bounded 
and closed to the Gaussian distribution. The jointly distribution depends not only on 
the range measurement noise statistic but also on the geometry of the sensor nodes 
and the target. 

Suppose that the maximum sensing radius of a sensor node is R s and the 
uncertainty in range measurement error is u. It means that the measurement error is 
smaller than u with a specified probability of p. 

If the target is covered by only one sensor, the target is in the shading donut shape 
(Fig. 2a) with the probability of p. The maximum uncertainty of the target is R s + u 
in this case. If the target is covered by two sensor, then the target is in either two 
shading area in Fig. 2b. The maximum uncertainty is 



•M-" 



(20) 



where d = 5 X 5 2 < 2R S is the distance between two sensor nodes. The probability the 
target is either in open of the two shading region in Fig. 3b is p 2 the following theorem 
yields a bound on the area of each shading region. 





Fig. 2. The uncertainty in range measurement. The target is covered by one sensor (a) and two 
sensors (b). 



Theorem i. Suppose that the target T is covered by 2 sensor nodes Si and S 2 , and the 
angle at T formed by 2 vertices Si, and S 2 is 5 1 75 2 =oc The maximum uncertainty of 
the target in one region is (one shading region Fig. 2b) 
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/ < min * 



i2V2u 



sin- 



oc 



; R + u >, 



(21) 



The first term in (22) can be easily verified by fundumental geometry, and the the 
second term is the maxium untertainty of a target in Fig. 2a. 





Fig. 3. The uncertainty in range measurement. The target is covered by one sensor (a) and two 
sensors (b). 



In Fig. 3a, as R goes to oo, the upper bond of uncertainty approaches 

2u 



I 



sin- 



ai 



(22) 



In short, the jointly measurement uncertainty depends on both sensor measured 
uncertainty u and the angle a. Thus, if given u is fixed, sensor nodes can be deployed 
densely enough such that sin- is within a certain bound. Moreover, the jointly 
measurement noise distribution is close to Gaussian. Hence, the measurement noise 
covariance matrix N is uniformly bounded. 

The choice of the cluster head is determined by the residual power (P res iduai) °f 
each node and its distance to the target. At each instant, every active node in the 
proximity of the target computes the weighted sum of its residual power and its 
distance to the target (D) as following W node = aD + pP re siduai with constants a and 
(3 in the interval [0, 1]. A node will become the new master node if its weighted sum 
is smaller than that of the current master node. Consequently, the knowledge of the 
Kalman filtering is transferred from the current master node to the new one. 



5 Numerical Examples 

5.1 Power Consumption 

The transmitted power P Tx , received power P Rx , idle power P t and sleeping power P s 
are 1400 mW, 1000 mW, 830 mW, and 130 mW respectively based on the power 
consumption analysis in [6]. From region R, the number of sensor nodes inside the 
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sphere radius R a isn a . n t is the number of sensor nodes outside the sphere with 
radius of R a , but inside the region R. The number of sensor nodes in the sensor field 
and number of active sensor nodes in the boundary are n and n b respectively. The 
total power consumption of the sensor field in one sampling cycle is calculated as 
following. 

The n a neighbors make n a transmissions and the master node receives n a times. 

Pmeas = ^ a (P Tx + P Rx ) (23) 

The master node broadcasts the target position and its directions, and it makes one 
transmission. Each of (n a + n{) neighbors in the cone area receives the information 
of the target once. 

^broadcast = ( n a + n i)PRx + ?Tx (24) 

Each active node, except measurement nodes, consumes an amount of the idle energy 

Pidie = (n b + n t )P t (25) 

The other nodes are sleeping, and the total power consumed by these nodes is 

Psieep = (n-n a -n i - n b )P s (26) 

Then total consumed power is 

mv — "raeas "■" * broadcast "■" "idle "■" "sleep \^') 

We will consider two scenarios to demonstrate the distributed Kalman Filter for target 
tracking. In the first, it is assumed that sensor nodes are uniformly disktributed. This 
requirement is relaxed in the second scenario where the nodes are randomly deployed. 
It is assumed that there is no hole in coverage within the regions to be monitored, and 
every point is covered by at least three sensors. The sensor field is assumed to be a 
square of the dimension 10x10 units as seen in Fig. 2. By choosing the distance of 
any two closest nodes is 0.5 units, the total number of uniformly distributed sensor 
nodes is 441. The target is assumed to move along the horizontal trajectory with the 
sinusoid velocity profile while the vertical coordinate remains at y = 5. In 10 seconds, 
the target travels between the coordinates (0, 5) and (10, 5). The sampling frequency 
is 200Hz and the simulation time is 10 seconds. The following difference equations 
are used to model the dynamic behaviors of the moving target. 

x k+l = F x k + w k 

z k = Hx k + v k (28) 

Where F = [^ °], x k = Q, H = [0 1] 

x k is the target velocity and and p k is target position in x the direction at time k. At is 
the sampling time. Moreover, w k and v k are Gaussian distributed with zero mean state 
noise and measurement noise. From scenario 1 to scenario 4, the initial condition for 
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the Kalman Filter is the same as the true value while it is nonzero in scenario 5. The 
sensor nodes are uniformly deployed in scenario 1 to scenario 5 while randomly 
deployed in scenario 6. 

Scenario 1. Without using the Kalman Filter, more sensors used in measurement 
results in better estimated tracking. As seen in Table 1, when the average measured 
sensor nodes increased from 4.5 to 17.5, the noise variance decreased from 21.71 X 
10~ 3 to 13.49 x 10~ 3 . However, the trade off is the total power consumption of the 
network increases from 1.38xl0 5 to 2.09x 10 5 (mW). The power consumption 
analysis is shown in Fig. 3. 



Table 1. Performance analysis 



Average 


Average 


Error variance 


Error variance 


Average total power 


measured 


active 


without Kalman 


with Kalman 


consumption 


sensors 


sensors 


Filter (X10 3 ) 


Filter (X1(T 3 ) 


(mW x 10 5 ) 



4.5 


9.3 


24.71 


3.63 


17.5 


39.2 


13.49 


1.57 


60.4 


139.9 


7.03 


0.98 


130.8 


275.5 


4.62 


0.31 


279.1 


416.2 


5.43 


0.10 



1.38 
2.09 
4.48 

7.88 
12.60 



-x 10 




4 6 

Time (seconds) 

Fig. 4. Without the Kalman Filter, the line number 1, 2, 3, 4, and 5 have average measured 
sensor nodes of 4.5, 17.5, 60.4, 130.8, and 279 respectively. For the line number 3 to 5, the total 
power consumption is fluctuated because when the target moves close to the boundary the 
number of active sensors is reduced. Then the total power consumption reduces. Line #1 and #2 
are quite flat because in these scenarios the relatively small cone regions result in small 
difference in the number of active sensors when the target in the middle of the field and when it 
is close to the boundary. 



Scenario 2. When the Kalman Filter is used, the variance of the estimated error is 
smaller and Figure 4 shows the smoother tracking performance compared to scenario 
1. As shown in Table 1, by using the Kalman Filter, only an average of 4.5 measured 
sensors is sufficient to achieve the error variance of 3.63 X 10~ 3 which is smaller 
than 5.43 x 10~ 3 resulted by an average of 279.1 measured sensors without using - 
Kalman filtering. 
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Scenario 3. When the number of average measured sensors and the sampling 
frequency are fixed, slower average velocity results in smaller estimated tracking 
error as shown in Fig. 5. In this scenario, the sampling frequency is 200Hz, the 
standard deviation of state noise and measurement noise are 0.01 and 0.2 respectively, 
and the average number of measured sensors is 6.3. 




10 20 30 40 50 

Average Velocity (unit/second) 



Fig. 5. Average velocity increases as the estimated error has a larger standard deviation 



^0.04 




0.1 0.2 0.3 0.4 0.5 

Standard deviation of measurement noise (unit) 

Fig. 6. When the distance measurement is subjected to a larger noise, the variance of estimated 
tracking error becomes bigger 



Scenario 4: In this scenario, the sampling frequency is kept at 200Hz, average target 
velocity is three units per second and the average number of measured sensors is 6.5. 
In Fig. 6, the standard deviation of state noise is fixed at 0.01 while the measurement 
noise has a standard deviation varying from 0.01 to 0.5. The variance of estimated 
error increases with the increase in measurement noise. In addition, with the same 
number of average measured sensors of 6.5, the smaller measurement noise leads to 
the better tracking performance. The tracking performance, shown in Figure 7, is 
better when the measurement noise is smaller. 

Scenario 5: When the master node does not share the knowledge of the target 
including the target state and the covariance matrix with the subsequent one, the 
subsequent master node has to run the Kalman Filter with the default initial 
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Time (seconds) 



Time (seconds) 



Fig. 7. The true and the estimated trajectory with different measurement noise levels. The 
standard deviation of measurement noise is 0.5 in the left side while it is 0.04 on the right side. 
(- True trajectory; — Estimated trajectory). 

conditions. Assuming that the difference between the initial position and the actual 
target position is the measurement error, the change in master nodes is indicated by 
the abrupt jumps in estimated error as shown in Fig. 8. When there is a change in the 
master node, the Kalman Filter requires some extra time steps to converge. 

Scenario 6. As shown in Fig. 10, when the sensor nodes are randomly distributed, we 
get similar results in comparison with the uniform scenario shown in Figure 3. 
However, the power consumption is not as smooth as it is in the uniform scenario. 
Due to the random nature, there are more sensor nodes covering a specific point while 
fewer sensor nodes are covering other points. In order for our algorithm to work 
effectively, at least three sensor nodes must cover each point in the sensor field. 




4 5 6 

Time (seconds) 



Fig. 8. Without sharing the state vector and covariance matrix to the subsequently master node, 
each master node has to start the Kalman Filter from scratch. The measurement noise standard 
deviation is 0.2, while the number of average measured sensor nodes is 7.6. (- True trajectory; - 
- Estimated trajectory). 



The above results show that the distributed Kalman Filter implementation in a 
WSN is successful in tracking moving targets. The tracking error is small when the 
target follows a linear trajectory while nonlinear trajectories with high target 
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velocities result in higher tracking errors. However, in all these scenarios, the tracking 
error is 12.5% smaller than that obtained in the absence of the Kalman Filter. In 
addition to the improved tracking performance, the distributed filter requires fewer 
nodes to be active at any given instant, thereby reducing the overall power 
consumption of the WSN. This is significant because the lowered power consumption 
increases the useful life of the WSN. 




Time (seconds) 



Fig. 9. Without sharing the state vector and covariance matrix to the subsequently master node, 
each master node has to start the Kalman Filter from scratch. The measurement noise standard 
deviation is 0.2, while the number of average measured sensor nodes is 7.6. 



-«i<r 







2 4 6 8 

Time (seconds) 



Fig. 10. Power consumption of one sampling cycle in random deployment. There are 441 
sensor nodes deployed in the sensor field of 10 x 10. The line number 1, 2, 3, 4 and 5 have 
average measured sensors of 3.4, 15.7 59.5, 127.2, and 259.7 respectively. . (- True trajectory; - 
- Estimated trajectory). 



6 Conclusions 



In this paper, a distributed computation approach is proposed for the tracking dynamic 
targets using a Wireless Sensor Network (WSN). The tracking problem is 
mathematically formulated and the tracking error of the distributed Kalman Filter is 
rigorously analyzed. It is shown that the proposed algorithm is stable and can track 
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the target with predetermined error bounds on the performance. The algorithm is also 
robust to changes in the velocity of the target and measurement noises. It is shown 
that the algorithm reduces the total power consumption in the network compared to 
similar algorithms reported in literature. The theoretical proofs are augments by 
numerical simulations that demonstrate the flexibility and power of the proposed 
tracking algorithm. 
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